PREFACE


The rapid explosion of knowledge relating recombinant DNA technology
to science and medicine has laid the groundwork for its application
to diagnosis of genetic disease, gene regulation, development of
transgenic animals, and gene therapy. The science of molecular
genetics has progressed from an understanding of the basics to
manipulations of genes using specific plasmid, viral, or shuttle
vectors, jet injection, and particle bombardment to introduce reporter
genes and transfect a variety of microbial, plant, animal, and human
cells. These techniques are now being extended to map the genome of
humans and other species. Early studies focused on developing the
linkage map, and parallel goals include answers to questions about how
genes are regulated and how they interact to control physiology and
development. These questions involve studying individual genes, as
well as sets of genes, the study of which is now termed “genomics.”

Recent experiments with transgenic mice have led to the prediction
that animal agriculture would undergo an immediate revolution. One
could envision “super cows, pigs, and other species,” which carry
transgenes that would make them particularly productive and healthy.
There remains a large knowledge gap, however, in applying this
technology in a scientifically and ethically acceptable manner to farm
animals and other species. The latest breakthrough, whereby the lamb
“Dolly” was successfully cloned from the cells of an adult sheep, has
catapulted these issues to the forefront of world debate.

Recent publications on specific genetic defects of large animals have
demonstrated the applicability of molecular genetics to the
improvement of animal health. Examples are the studies of leukocyte
adhesion deficiency (LAD) of Holstein cattle; hyperkalemic periodic
paralysis (HPP) of Quarter Horses; malignant hyperthermia of swine;
and canine progressive retinal atrophy, copper storage disease,
pyruvate kinase deficiency, phosphofructokinase deficiency, and von
Willebrand’s disease (vWD).

Development of genetic techniques, including sequencing of specific
normal and mutant genes, has permitted positive identification of
carrier animals for effective genetic screening. For example, the
frequency of carriers for bovine LAD varies from 6 to 15%, whereas
equine HPP has a major impact on the genotype of the Quarter Horse.
Similarly, breeds of dogs that carry a high frequency of vWD can be
affected to the extent of 15 to 80% of current bloodlines.

Not all genetic defects have undesirable consequences, for evolution
of some abnormalities may have a potential beneficial effect in the
carrier state. Very little is currently known about the potential
physiological or other benefits associated with certain genotypes and
their observed phenotypes. Identification of genetic markers permits
their selection to segregate economically important animals for
beneficial traits and allows for selective breeding and even positional
cloning of the genes involved. The future for marker-assisted selection
of the genetic loci responsible for economically valuable traits is
potentially large and can involve a variety of domestic farm animals
including cattle, pigs, and chickens. How these tools will be used
appropriately and responsibly by the animal industry has yet to be
determined.

Application of these techniques to the diagnosis and control of
infectious disease is also important. For example, understanding the
genetic basis of various animal and human retroviruses leads to ways
to control their respective diseases. Recent work on the equine
infectious anemia virus indicated that exchange of the envelope region
of a highly virulent strain of the virus with the same region of an
avirulent viral strain can make the avirulent strain virulent. This
research has opened a wide range of possibilities for viral
manipulation in controlling the disease in infected horses and also for
the development of other antiinflammatory and antipyretic drugs for
therapy of the disease.

These recent developments in biotechnology have led to an increasing
plethora of related patents. The idea of patenting genes, particularly
human genes, has raised a series of ethical dilemmas. While patents
have become crucial to the biotechnology business, scientists and
businessmen are concerned about how patents they file for portions of
a gene (i.e., copies of gene fragments) could potentially prevent
patenting the useful genes themselves. In other words, having a patent
on a part of a gene could prevent the patenting of discoveries that
affect the whole gene. Thus, the actual useful genes or gene products
may not be protected by patents. A more utilitarian approach could be
adopted in order to solve this problem plaguing the commercial
biotechnology industry. Linked hand-in-hand with the commercial
concerns are the ethical problems of holding a monopoly on patented
technology which could provide lifesaving therapeutic products. The
ethical issues are particularly relevant to the application of this
technology to veterinary medicine.

The present volume addresses various perspectives on these exciting
and challenging times. We thank the authors Jens Hauge, Janelle
Cortner, Susan Vande Woude, George F. Vande Woude, George H.
Sack, Johannes Walter, Katherine A. High, Frederick J. Fuller,
Cathryn S. Mellersh, and Elaine A. Ostrander for their scholarly
contributions.

W. Jean Dodds 
James E. Womack 

I. Introduction to Molecular Genetics

During the past 20 years, an immense expansion has taken place in
our understanding of the structure, organization, and regulation of
genes. This expansion has, to a large extent, been the result of the
development of new methods to manipulate, multiply, and study DNA
fragments. The same new methods have a number of important
applications in human medicine and, gradually, also in veterinary
medicine. For good overviews, see Old and Primrose (1990) and Watson
et al. (1992). An agricultural perspective is given in the book edited
by Hansel and Weir (1990). These surveys also discuss the large impact
on microbiological diagnosis and vaccine production, aspects that are
not addressed here.

The development of gene technology rests on the foundations of
molecular biology laid during the preceding 30 years, with the
discovery of the double helical structure of DNA and the breaking of
the genetic code as the major milestones. I will start this chapter,
therefore, with a brief review of the molecular biological basis of
gene technology.

A. Structure and Replication of Genes 

The notion that heredity was linked to matter, to certain cellular
structures, appeared for the first time in 1903, when the chromosomal
theory of heredity was formulated. Chromosomes were recognized to
have properties that could accommodate the hereditary units
discovered by Mendel in his experiments. A bridge was built between
cytology and genetics. But the nature of the molecules that carried the
hereditary information was not thereby identified. Many researchers
believed that chromosomal protein was the most likely candidate. In
1944, however, Avery and co-workers showed convincingly that genes
in pneumococcal bacteria are molecules of deoxyribonucleic acid
(DNA). It soon became apparent that pneumococci were no exception
in this respect. In animals and plants, as well as in bacteria, hereditary
information is carried by DNA.

Figure 1a shows a section of a DNA molecule. DNA is built in the
form of a chain; phosphoric acid and the sugar deoxyribose alternate as
the links in this chain. Phosphoric acid always connects the 5'-OH
group in one deoxyribose to the 3'-OH in the next deoxyribose. The
heterocyclic bases adenine, guanine, cytosine, and thymine are bound
to the 1' carbons in deoxyribose. The base-deoxyribose moiety is a
nucleoside, and the repeating unit in the chain, the combination
base-deoxyribose-phosphoric acid, is a nucleotide.

Fig. 1. (a) Structure of part of a DNA chain. (From BIOCHEMISTRY, 2E by Lubert Stryer. Copyright © 1975, 1981. Reprinted 
with permission of W. H. Freeman and Company.) (b) DNA double helix during replication. (Reprinted by permission from Jones 
and Bartlett Publishers, from Molecular Biology by David Freifelder, 1983, p. 254.) 

It is impractical to write DNA structures with all the atoms, and
simplified notations have come into use. If the bases from the top on
down in Fig. 1a are adenine, thymine, and guanine, the most-used
abbreviation is simply ATG. One uses the first letter in the name of the
base to designate the corresponding nucleotide.

That such chainlike DNA molecules could carry hereditary
information, genes, was not difficult to conceive. Most genes determine
the construction of proteins, which also are chainlike molecules, with
amino acids making up the links in the chain. One needs only to
postulate a one-to-one relationship between an amino acid and a group
of DNA bases. The other fundamental property of the hereditary
material, that it is accurately duplicated and distributed to both
daughter cells when cells divide, was harder to visualize on the basis
of the DNA structure as shown in Fig. 1a. Watson felt that the solution
to the problem perhaps lay hidden in the three-dimensional structure of
DNA and, together with Crick, he set out to elucidate this structure.
The result was the discovery in 1953 of the double-helix structure of
DNA. The double helix is built as shown schematically in Fig. 1b. A, T,
G, and C here represent the bases, which point toward each other and
are bound to each other through hydrogen bonds. The DNA strands of the
double helix have opposite directions (a DNA chain has a 5' end and a
3' end; see Fig. 1a). The size of the bases and their hydrogen bonding
properties make only two pairings possible, namely, A with T and G
with C. One turn of the standard double helix has 10 base pairs (bp)
and covers a length of 3.4 × 10⁻⁶ mm. The diameter of the double helix
is 2 × 10⁻⁶ mm. It is so slim a structure that if a DNA molecule could
reach from the Earth to the moon, it would weigh only 1 mg!
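
The Earth-to-moon estimate can be checked with a short calculation. The sketch below is illustrative only: the helix dimensions come from the text, but the Earth-moon distance (about 384,400 km) and the average mass of a base pair (about 650 daltons) are assumed values that the text does not give.

```python
# Rough check of the "Earth to the moon" estimate for a DNA double helix.
AVOGADRO = 6.022e23        # molecules per mole
TURN_LENGTH_NM = 3.4       # length of one helical turn (from the text)
BP_PER_TURN = 10           # base pairs per turn (from the text)
EARTH_MOON_KM = 384_400    # assumed mean distance, not from the text
DALTONS_PER_BP = 650       # assumed average mass of one base pair


def dna_mass_mg(length_km):
    """Return the mass in milligrams of a double helix of the given length."""
    length_nm = length_km * 1e12                       # 1 km = 1e12 nm
    base_pairs = length_nm / (TURN_LENGTH_NM / BP_PER_TURN)
    grams = base_pairs * DALTONS_PER_BP / AVOGADRO     # daltons -> grams
    return grams * 1000.0


print(f"{dna_mass_mg(EARTH_MOON_KM):.1f} mg")
```

Under these assumptions the result is on the order of 1 mg, in line with the figure quoted above.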

The model for the DNA molecule that Watson and Crick arrived at
can in a simple way explain how DNA molecules are duplicated, i.e.,
the process of DNA replication. When we consider the double helix
(Fig. 1b), we see that the two strands correspond to each other. The
base-pairing rules determine which base in one strand must be
opposite a given base in the other strand. The strands are said to be
complementary. When replication starts, the hydrogen bonds between the
two strands are gradually broken (Fig. 1b). New DNA chains are
constructed from the four deoxyribonucleoside triphosphates with the
help of the enzyme DNA polymerase, one molecule of pyrophosphate
being eliminated per nucleotide laid down. Step by step, two new
strands, complementary to the original strands, are synthesized. The
original strands serve as templates in the process, and the result is two
new double helices, identical to each other and to the original DNA
molecule.
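
The template principle can be illustrated with a few lines of code. This is a minimal sketch, not a model of the enzymology: the function names and the example sequence are hypothetical, and each strand is written as a string in the 5' to 3' direction.

```python
# Base-pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(template):
    """Return the strand complementary to `template`, both written 5'->3'.

    The strands run in opposite directions, so the template is read in
    reverse while each base is replaced by its pairing partner.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template))


def replicate(duplex):
    """Separate the two strands and copy each; return two daughter duplexes."""
    strand, partner = duplex
    daughter_1 = (strand, complementary_strand(strand))
    daughter_2 = (complementary_strand(partner), partner)
    return daughter_1, daughter_2


parent = ("ATGTTTTCA", complementary_strand("ATGTTTTCA"))
first, second = replicate(parent)
print(first == parent and second == parent)
```

Each daughter duplex keeps one original strand and gains one newly built strand, and both daughters carry the same sequence as the parent.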

The basic mechanism for DNA replication is simple and elegant. The
practical execution of DNA replication in the cells, however, involves a
large number of proteins, including some extra enzymes. The
unwinding of the DNA molecules causes problems, as does the fact that
DNA polymerase lays down a new strand from the 5' end toward the 3' end
only. Furthermore, for this synthesis to start, a primer nucleic acid is
needed for the strand to grow from. One important auxiliary enzyme is
DNA ligase, which links together DNA molecules.

If DNA is heated to 80-90°C, the molecule will denature and the
strands separate. If the temperature is lowered somewhat, the
complementary strands will find each other again and re-form the double
helix. This property is widely exploited in gene technology (see Section
II,B,1).


B. From Gene to Gene Product 

When the information stored in DNA is used, organisms use another 
type of nucleic acid, ribonucleic acid (RNA). RNA differs from DNA in 
that it contains ribose instead of deoxyribose, and uracil instead of 
thymine. Uracil has the same base-pairing properties as thymine. 
RNA does not normally exist in double helix form, but RNA may form 
short stretches of double helix when the RNA strand folds back on 
itself locally. A DNA strand and a complementary RNA strand will 
form a double helix. 

RNA is made by transcription from one of the DNA strands in a
double helix (Fig. 2). The DNA strand is used as a template in a process
similar to DNA replication. The DNA double helix is opened locally,
and an RNA strand complementary to the template is synthesized with
help from the enzyme RNA polymerase, ribonucleoside triphosphates
being used as activated building blocks. Special sequences of bases in
front of the genes (promoters) are recognized by the polymerase,
making it start at the right place and on the right DNA strand. At the
other end of the genes, there are termination signals.

In all cells, there are three main types of RNA: messenger RNA, 
ribosomal RNA, and transfer RNA. Messenger RNA (mRNA) carries 
the information specifying the amino acid sequences in the proteins. 
Ribosomal RNA (rRNA) molecules form part of the structure of the 
protein-synthesizing machinery of the cells, the ribosomes. Transfer 
RNA (tRNA) participates in protein synthesis (translation) by carrying 
amino acids to the ribosomes. 

The mRNA is of special importance. For each kind of protein chain
the cell makes, there is a corresponding mRNA. The genetic coding
system that biochemists discovered between 1960 and 1965 showed
that organisms use groups of three nucleotides, triplets or codons, to
code for amino acids. It is, therefore, the sequence of triplets in mRNA
that directly determines the sequence of amino acids in the
corresponding polypeptide chain. The coding group could not have
consisted of only two nucleotides. That would have given only 4² = 16
combinations, and there are 20 amino acids that need coding. But
4³ = 64 gives more than enough. Nature has chosen this most economical
solution and has done it in a way so that 61 of these 64 triplets code
for some amino acid. Almost all the amino acids, therefore, have more
than one codon.
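
The counting argument can be reproduced by simple enumeration. The sketch below lists every doublet and triplet over the four RNA bases and sets aside the three termination triplets (UAA, UAG, and UGA, named later in this section); the variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"
STOP_CODONS = {"UAA", "UAG", "UGA"}   # triplets that code for no amino acid

# All possible groups of two and of three nucleotides.
doublets = ["".join(p) for p in product(BASES, repeat=2)]
triplets = ["".join(p) for p in product(BASES, repeat=3)]

# Triplets left over to specify the 20 amino acids.
sense_codons = [t for t in triplets if t not in STOP_CODONS]

print(len(doublets), len(triplets), len(sense_codons))   # 16 64 61
```

With 61 coding triplets shared among 20 amino acids, most amino acids must be served by several codons.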

When ribosomes directed by mRNA link together the amino acids, it 
is crucial that the ribosomes start at the correct position on the mRNA. 
Starting one nucleotide too early or late will give a completely different 
protein. A correct start is secured by having all protein synthesis start 
with the triplet AUG, which codes for methionine. This means that all 
proteins initially are synthesized with methionine as their first amino 
acid. In posttranslational processing, however, some proteins lose this 
initial methionine. As methionine also occurs internally in proteins, 
something more is needed for a correct start. In bacteria, the ribosomes 
recognize start AUGs by the presence of a short stretch of purines a 
short distance before the AUG. In animals and plants, the ribosomes 
choose the AUG that is closest to the 5' end of the mRNA. The
triplets UAA, UAG, and UGA do not code for any amino acid; these
codons are used as termination signals.

The amino acids do not by themselves have the ability to recognize
their codons on mRNA. In addition, they must be activated in order to
form peptide bonds with each other. Both of these problems are solved
by each amino acid becoming bound to a tRNA with the help of a
specific activating enzyme, using adenosine triphosphate (ATP). tRNA
has in its structure an anticodon, a set of three nucleotides that form
base pairs with the codon on mRNA for the corresponding amino acid.
On the ribosome surface, the amino acids leave their tRNA carriers as
they are linked together to form polypeptide chains. Briefly, the
information flow is as illustrated here for the start of a hypothetical
gene:

3' . . . TAC AAA AGT GAC TGT CCG CAT TAC . . . 5'   DNA
5' . . . ATG TTT TCA CTG ACA GGC GTA ATG . . . 3'

                   transcription

5' . . . AUG UUU UCA CUG ACA GGC GUA AUG . . . 3'   mRNA

                   translation

         Met Phe Ser Leu Thr Gly Val Met . . .      protein
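
The same information flow can be traced in code. This is a simplified sketch under stated assumptions: the codon table holds only the triplets needed for this hypothetical gene plus the three stop codons (the full table has 64 entries), translation simply starts at the first AUG, and the function names are illustrative.

```python
# Pairing of DNA template bases with the RNA bases laid down opposite them.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Partial codon table: only the codons used in this example, plus stops.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "UCA": "Ser", "CUG": "Leu",
    "ACA": "Thr", "GGC": "Gly", "GUA": "Val",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(template_3_to_5):
    """Return the mRNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def translate(mrna):
    """Read triplets from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")
    if start < 0:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


template = "TACAAAAGTGACTGTCCGCATTAC"   # the 3'->5' strand shown above
mrna = transcribe(template)
print(mrna)
print(" ".join(translate(mrna)))
```

Shifting the starting position by one nucleotide would regroup every triplet that follows, which is why the choice of start codon matters so much.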


II. Gene Manipulation 


A. Basic Features 

The core of the new gene manipulation techniques is that one can
join two pieces of DNA to form a recombinant DNA molecule, which
then is multiplied, or cloned, in a suitable host organism to obtain
sufficient amounts of the piece of DNA to study or use. One often uses
as one of the pieces of DNA a plasmid, which is a ring-shaped,
extrachromosomal DNA molecule that occurs in many bacteria and carries
genes, for instance, for antibiotic resistance. Figure 3 illustrates
molecular cloning of a piece of DNA with the help of a plasmid as vector.
The plasmid vector is opened with the help of a special endonuclease,
and the foreign DNA is joined in with the help of another enzyme. Under
proper conditions, some of the bacteria will take in the recombinant
DNA molecule, which will multiply inside while the bacteria also grow
and multiply. Most of the bacteria will not have taken up any plasmid. A
selection is therefore necessary and is usually based on the plasmid
carrying an antibiotic resistance gene. One can then isolate the
plasmid and the foreign DNA piece from the bacteria in such amounts that
it is possible to study it using the electron microscope, determine its
base sequence, inject it into fertilized ova, or use it as a diagnostic
probe, for example.

When one, as often is the case, does not have a homogeneous DNA
preparation as starting material for the cloning, but rather has a
mixture of many types of DNA pieces, special selection techniques are
required afterward to find the interesting clone or clones.

Plasmids are but one type of vector for DNA cloning. Viruses from 
bacteria or animals also are used frequently. 

Fig. 3. Basic steps in cloning of DNA in a plasmid.


B. Methods and Tools 
1. Nucleic Acid Hybridization 

Nucleic acid hybridization is fundamental in both basic and applied
gene technology. This is the process by which sequence-specific
DNA:DNA or RNA:DNA duplexes are formed from single-stranded
nucleic acids. For most uses, one of the components in the hybridization
is radioactively labeled, or labeled by other means, so that the hybrid
can be identified. The labeled molecule is referred to as a probe. It is
used to probe nucleic acid mixtures for the presence of sequences
complementary to that of the probe (Fig. 4).

Fig. 4. Hybridization.

The factors affecting the stability of the hybrid and the rate of the
hybridization reaction are fairly well known (Marmur and Doty, 1961;
Nygaard and Hall, 1964; Young and Anderson, 1985). The stability
depends on the temperature, the ionic strength of the medium (high
ionic strength reduces the repulsion between the negatively charged
phosphate groups and thus stabilizes the duplex), the presence of
formamide (which breaks hydrogen bonds), the G + C content of the probe
(G- and C-rich probes are bound more tightly because there are three
hydrogen bonds linking G to C, compared to two bonds between A and
T), and the length of the probe. The temperature at which the duplex is
50% denatured is called its $T_m$. $T_m$ can be calculated from the
formula:

$$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6 \log M + 0.41(\%G + \%C) - \frac{500}{n} - 0.61(\%\ \text{formamide})$$

Here M is the molar concentration of monovalent cations and n is the
length of the duplex in base pairs. This formula assumes perfect
complementarity between the hybridizing strands. With mismatches
present, the $T_m$ is reduced by about 1°C per 1% mismatch.
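
The formula is easy to evaluate numerically. The sketch below is a direct transcription under the usual reading of the symbols, which the text does not spell out: the logarithm is taken to base 10, M is the molar concentration of monovalent cations, and n is the duplex length in base pairs. The function name and the example conditions are hypothetical.

```python
import math


def melting_temperature(molar_salt, percent_gc, length_bp,
                        percent_formamide=0.0, percent_mismatch=0.0):
    """Estimate the Tm of a duplex in degrees Celsius.

    percent_gc is the combined G + C content of the probe in percent.
    The mismatch term applies the rule of thumb of about 1 degree C
    per 1% mismatch.
    """
    tm = (81.5
          + 16.6 * math.log10(molar_salt)
          + 0.41 * percent_gc
          - 500.0 / length_bp
          - 0.61 * percent_formamide)
    return tm - 1.0 * percent_mismatch


# Hypothetical conditions: 0.3 M salt, 50% G + C, 500-bp probe, 50% formamide.
tm = melting_temperature(0.3, 50.0, 500, percent_formamide=50.0)
print(f"Tm is about {tm:.1f} degrees C")
```

The calculation shows why formamide is useful in practice: at 50% it lowers the estimated melting temperature by about 30°C, so that hybridization can be run at a moderate temperature.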

The factors affecting the stability also affect the rate of the
hybridization. The rate is optimal about 25°C below $T_m$. For a
reaction in solution, the rate is proportional to the concentrations of
the reacting species. When the DNA or RNA to be investigated is
immobilized on a membrane, which is often the case, the reaction is
slower by a factor of 7-10, but the general effects of the factors
listed previously are the same.

The reaction conditions are manipulated to give hybridizations at a
desired level of stringency. High stringency (that is, high temperature
or low ionic strength) will allow only the most stable hybrids to form,
those having perfect complementarity. In certain situations, one may
want to probe for a family of related DNA molecules with some
sequence differences. Hybridization at a lower stringency will then be
chosen. The same may be necessary when using a probe from one
species to look for a homologous gene in another species.

Several methods are used to label a probe with radioactivity
(Arrand, 1985). In our experience, the random primer method (Feinberg
and Vogelstein, 1983) is a convenient way of obtaining high specific
activity, better than the so-called nick translation. The probe DNA is
denatured and both strands are used as templates for DNA synthesis,
with short random oligonucleotides as primers and one or more of the
triphosphates labeled with ³²P.

Nonradioactive labeling is also used because it has certain
advantages for routine, diagnostic purposes. Attaching biotin to one of
the bases (Leary et al., 1983) has received particular attention. The
biotinylated probe is localized with streptavidin and a biotin-coupled
enzyme, for which a substrate that gives a colored precipitate is used.
Nonradioactive probes have the drawback of multistep detection
procedures and a somewhat lower detection sensitivity than do
radioactive probes (Syvanen, 1986).

2. Restriction Endonucleases 

Restriction endonucleases of type II, usually referred to as
restriction enzymes, are a major tool in gene analysis and gene
manipulation. These enzymes bind to particular sequences on DNA and
cut both strands within this DNA site. The majority of the enzymes
have sites of tetra-, penta-, hexa-, or heptanucleotides, which have
an axis of rotational symmetry (Fig. 5a). Enzymes corresponding to over
150 different sites have been discovered so far. The names of the
enzymes have been constructed from names of the bacteria and the
particular substrain from which they have been isolated. The arrows in
Fig. 5a indicate where the DNA strands are cut by these 5 enzymes.
Treatment of a DNA molecule with a restriction enzyme yields a
reproducible set of fragments. An enzyme with a recognition site of 4
nucleotide pairs will cut relatively often, statistically once every
4⁴ (i.e., 256) nucleotide pairs, while an enzyme with a hexanucleotide
recognition site will cut less often, once every 4⁶ (i.e., 4096)
nucleotide pairs. Figure 5b shows the pattern of fragments into which
three different restriction enzymes cut the SV40 virus chromosome.
After the enzyme treatment, the fragments were separated by
electrophoresis in an agarose gel and made visible by binding of
ethidium bromide, which fluoresces in ultraviolet light. The sizes of
the fragments can be found by running molecular weight markers in the
same gel. Short fragments move farther and long fragments a shorter
distance.

Fig. 5. (a) Specificities of some restriction enzymes. (From BIOCHEMISTRY, 2E by
Lubert Stryer. Copyright © 1975, 1981. Reprinted with permission of W. H. Freeman and
Company.) (b) Gel electrophoresis pattern showing SV40 DNA fragments produced with
each of three restriction enzymes. (Courtesy of Dr. Jeffrey Sklar.)

MOLECULAR GENETICS TO DIAGNOSIS, GENE THERAPY
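
The cutting statistics and the resulting fragment pattern for restriction enzymes can be sketched in a few lines. The code below is illustrative only: it treats the DNA as linear with equal base frequencies, the example sequence is invented, and the function names are hypothetical. The site GAATTC, cut after the first G, is that of the enzyme EcoRI.

```python
import re


def expected_spacing(site_length):
    """Average distance between sites in random DNA with equal base use."""
    return 4 ** site_length


def digest(dna, site, cut_offset):
    """Cut a linear DNA at every occurrence of `site`.

    `cut_offset` is the position of the cut within the site, counted from
    its first base. Returns the fragment lengths in order along the DNA.
    """
    pattern = f"(?={re.escape(site)})"          # lookahead finds every site
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, dna)]
    edges = [0] + cuts + [len(dna)]
    return [end - start for start, end in zip(edges, edges[1:])]


print(expected_spacing(4), expected_spacing(6))   # 256 4096

example = "AAGAATTCTTTTGAATTCGG"                  # invented 20-bp sequence
fragments = digest(example, "GAATTC", 1)
print(sorted(fragments, reverse=True))            # longest band first
```

Sorting the fragments by length mimics their order on a gel, where the longest fragments travel the shortest distance.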

The fragments that the restriction enzymes produce will for most
enzymes have single-stranded, complementary end sections, as the
cuts are not made straight across (Fig. 5a). The fragments are said to
have sticky ends. If a plasmid is opened with the same restriction
enzyme that has been used to produce the DNA fragments to be cloned,
this DNA will have sticky ends of a type that allows its insertion in the
plasmid. Figure 6a shows what the situation would have been in Fig. 3
if the enzyme EcoRI had been used.

3. DNA Ligase 

To form a stable, recombinant DNA molecule from two molecules,
the strands of these two molecules must be joined. This is effected by
the enzyme DNA ligase. The enzyme requires for its action a
5'-phosphate and a 3'-OH group. The enzyme from phage T4 uses ATP as
energy donor, while the Escherichia coli enzyme uses nicotinamide
adenine dinucleotide (NAD). The ligation reaction is carried out at a
low temperature, 4-15°C, in order to increase the stickiness of the
sticky ends.


4. Terminal Deoxynucleotidyltransferase 

Terminal deoxynucleotidyltransferase, or terminal transferase, is
useful when the objective is to insert in a plasmid a DNA molecule that
has not been produced with the help of a restriction enzyme, a
frequently occurring situation. With this enzyme, tails of identical
nucleotides can be attached to the 3' end of DNA strands. The plasmid is
opened with a suitable restriction enzyme and furnished with a tail of
about 20 G residues, while the DNA to be inserted is furnished with a
C tail (Fig. 6b).

Fig. 6. (a) Fitting a piece of DNA into an EcoRI site of a plasmid. (b) Fitting a piece of
DNA into a plasmid after the plasmid is tailed with Gs and the insert with Cs, using
terminal transferase.

JENS G. HAUGE

5. Adaptors 

Using an adaptor DNA containing a restriction site is a frequent
alternative strategy when a DNA molecule is to be inserted in a
plasmid opened with the corresponding restriction enzyme. A piece of
DNA, 6-10 bp long, containing the desired restriction site is
synthesized or purchased. The ends of the DNA to be inserted are first
made blunt by filling in with DNA polymerase. All 5' ends are
phosphorylated with the enzyme polynucleotide kinase, and the adaptor is
attached to both ends of the insert DNA with DNA ligase. Treatment
with the restriction enzyme then produces the proper sticky ends for
insertion in the plasmid. This method illustrates well the role of both
chemistry and enzymology in recombinant DNA technology.

C. Vectors 

Through stepwise modifications of natural E. coli plasmids, new
plasmids that are safe and convenient for different types of gene
manipulation have been constructed. One much used plasmid is pBR322
(Fig. 7a). This plasmid carries two antibiotic resistance genes, one for
tetracycline and one for ampicillin. Within these two genes, there are
several restriction sites for enzymes that have only these sites in the
pBR322 molecule. If one uses the BamHI site to insert a foreign DNA,
the tetracycline resistance will be eliminated. It is then possible to
select for bacteria that have taken up the plasmid by using ampicillin
in the medium and then find those that have an insert in the BamHI
site by investigating which bacterial colonies have lost the
tetracycline resistance. This is useful because there can be some
regeneration of the original plasmid in the ligation step.

Fig. 7. (a) The cloning vector pBR322 (4362 bp). Amp: gene for ampicillin resistance.
Tet: gene for tetracycline resistance. (b) The Southern blot technique. x indicates the gene
or sequence of interest. (“An Introduction to Recombinant DNA,” Alan E. H. Emery,
1984. Reprinted by permission of John Wiley & Sons, Ltd.)

Lambda virus DNA has also been modified so that it is suitable as a
vector. Three of its five EcoRI sites have been removed. The DNA
between the two remaining sites is not necessary for the infection
process and can be exchanged with a foreign piece of DNA of 10-20
kilobase (kb) pairs. The recombinant DNA is packed in vitro using
lambda coat and tail proteins. These phage particles enter E. coli
bacteria very efficiently and reproduce as in a normal lytic infection.

For other bacteria, for yeast, and for animal and plant cells there are 
other vectors. Shuttle vectors, vectors with specificity for two different 
hosts, have also been constructed. 


D. Studies of Gene Structure 


Comprehensive solutions to a wide range of biological problems
require knowledge of the structure of the gene and the regulatory
sequences that govern the particular biological phenomenon.


1. cDNA Cloning 

For studies of genes of viruses and bacteria, cloning can start with
fragments of their chromosomal DNA. With genes from animals and
plants, the situation is different. Here the genome is so large and
contains so many noncoding sections that one often must start from the
mRNA molecules and carry out what is called a complementary, or copy,
DNA (cDNA) cloning. Total RNA is isolated with chemical methods from a
tissue of interest, and mRNA is adsorbed on a column of
poly(T)-cellulose. mRNA is bound due to the 3' tail of poly(A), which
mRNA possesses in higher organisms. mRNA is eluted and used as a
template for RNA-directed DNA polymerase, so-called reverse
transcriptase, an enzyme occurring in retroviruses, where its role
during the viral infection is to produce a double-stranded DNA
corresponding to the viral RNA genome. In principle, the same reaction
now takes place in vitro. A mixture of cDNAs is formed, corresponding
to the mixture of mRNAs isolated.

The mixture of cDNA molecules can now, after attachment of sticky
ends, be inserted in a plasmid. If the bacteria to be transformed have
been made competent by a special treatment, one can expect to obtain
10⁵-10⁷ transformed bacteria per microgram of recombinant plasmid.
The clones from these transformations make up a cDNA library for
that particular tissue.

In this library are cDNA clones corresponding to the different
mRNAs in the tissue. Clones for the particular gene one is interested
in can be found by various means. The simplest situation is when one
knows something about the amino acid sequence of the protein for
which the gene codes. Then one can synthesize a DNA probe of 17-19
bases, corresponding to a suitable hexapeptide. This probe is long
enough that the search will be specific at high stringency. The probe
preparation will have to be a mixture of DNA molecules, however,
because most amino acids have more than one codon. An imprint of the
bacterial colonies is made on a nitrocellulose filter, the cells are
opened, and their DNA is denatured. If one then uses a radioactive
probe, it will hybridize to a few colonies, those containing cDNA
corresponding to the gene of interest. Plasmid DNA is purified and
treated with the proper restriction enzyme, and the cDNA is isolated
from a hybridizing band on an agarose gel after electrophoresis.
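
How large the probe mixture must be depends on which hexapeptide is chosen, and a short calculation makes the point. The sketch below counts the codons for each amino acid in the standard genetic code and multiplies them along the peptide. The two hexapeptides (in one-letter amino acid code) and the function name are hypothetical examples, and the option to drop the last base models a 17-base probe.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def probe_degeneracy(peptide, drop_last_base=False):
    """Number of different oligonucleotides needed to cover `peptide`.

    With drop_last_base=True, the third base of the final codon is left
    out, so only distinct two-base beginnings are counted for it.
    """
    total = 1
    for position, amino_acid in enumerate(peptide):
        codons = {c for c, aa in zip(CODONS, AMINO_ACIDS) if aa == amino_acid}
        if drop_last_base and position == len(peptide) - 1:
            codons = {c[:2] for c in codons}
        total *= len(codons)
    return total


print(probe_degeneracy("MWKDEF"))                       # favorable peptide
print(probe_degeneracy("MWKDEF", drop_last_base=True))  # 17-base version
print(probe_degeneracy("LSRLSR"))                       # unfavorable peptide
```

A stretch rich in methionine and tryptophan, which have a single codon each, needs only a handful of oligonucleotides, whereas one rich in leucine, serine, or arginine needs tens of thousands. This is why a "suitable" hexapeptide is sought.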

2. Gene Cloning 

A collection of clones that includes all DNA in an organism is called a 
genomic library . It is usually made in a lambda vector. DNA from the 
organism is subjected to incomplete digestion with a frequently cutting 
restriction enzyme, so that fragments of about 20 kb are formed. Vector 
arms are ligated to both sides of these fragments, and the resulting 
DNA molecules packed into lambda proteins, as described in Section 
II,C. The lambda clones that are the result of the infection will, if one 
has close to a million different clones, represent all the DNA and genes 
of a mammal. If one now has a cDNA clone for the gene one is inter¬ 
ested in, this cDNA can be radioactively labeled and used as a probe to 
find the lambda clones that contain the gene or parts of it. 
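
The figure of close to a million clones can be checked with a standard
probability estimate that is not given in the text: if each clone
carries a random fraction f of the genome, the number of clones needed
to include a given sequence with probability P is N = ln(1 - P) /
ln(1 - f). The sketch below assumes a mammalian genome of 3 × 10^9 bp
and 20-kb inserts, as mentioned in this chapter; the 99% target is an
assumption made for illustration.

```python
import math


def clones_needed(genome_bp: float, insert_bp: float, p: float) -> int:
    """Clones required so that a given sequence is present with probability p."""
    f = insert_bp / genome_bp  # fraction of the genome carried by one clone
    return math.ceil(math.log(1.0 - p) / math.log1p(-f))


# Assumed values: 3e9 bp genome, 20 kb inserts, 99% probability.
n99 = clones_needed(3e9, 20_000, 0.99)
print(n99)
```

Under these assumptions the estimate comes out at roughly 700,000
clones, in line with the statement above.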

3. Splitting of Eukaryotic Genes 

One of the big surprises in molecular biology research was the
discovery, made in 1977 through the study of cloned eukaryotic genes,
that such genes are split. The chicken gene for ovalbumin was found to
be 7700 bp long, while only 2000 bp would be required to code for the
protein. Closer investigation of the cloned DNA with restriction
enzymes and DNA sequencing revealed that the coding part of the gene
was divided into eight pieces, exons, separated by larger pieces of
noncoding DNA, introns.

4. Restriction Maps, Southern Blots, and DNA Sequencing 

Gene structure is characterized in broad outline by restriction map¬ 
ping and in detail by establishing the DNA sequence. A restriction map 
identifies the positions of sites for a set of restriction enzymes. The 
correct order for the fragments that are produced by a given enzyme is
determined in several ways, for example, through comparison of DNA 
fragment lengths corresponding to different degrees of partial diges¬ 
tion with the enzyme. The approximate distances between the cut sites 
are obtained by including molecular weight markers in the electro¬ 
phoresis. Accurate distances require DNA sequencing. 

If one has a cDNA probe for a gene, it is possible to obtain some 
information on the structure of the gene without having cloned it, 
namely, by using the technique developed by Southern (1979) (Fig. 7b). 
After digestion with a restriction enzyme and separation by agarose 
electrophoresis, the DNA fragments are denatured in alkali and trans¬ 
ferred to a nitrocellulose or nylon membrane with a flow of buffer. A 
replica of the gel pattern of DNA is thereby produced on the mem¬ 
brane. When the membrane is then treated with the radioactive probe, 
it will hybridize to those bands that have DNA sequences complemen¬ 
tary to the probe. This will show up on an X-ray film. In this way, one 
obtains information about restriction sites in the gene region corre¬ 
sponding to the cDNA. 

DNA sequencing is carried out with a chemical method developed by 
Maxam and Gilbert (1977) or with an enzymatic method developed by 
Sanger et al. (1979). In the chemical method, one starts with a cloned
DNA fragment of a few hundred bases, which has the 5' end radioac- 
tively labeled. From this, a spectrum of radioactive fragments is pro¬ 
duced in which all sizes from the full length down to zero are repre¬ 
sented. In principle, this is done in four reactions, one reaction each for 
fragments ending with a G, a C, a T, or an A. The fragments are 
separated by gel electrophoresis at a voltage of about 3000 V, and an 
autoradiograph of the gel is made (Fig. 8a). The sequence can be direct¬ 
ly read from this autoradiograph, as indicated in Fig. 8a. The enzyma¬ 
tic method gives a similar end result. Here DNA polymerase synthe¬ 
sizes DNA in four reactions in the presence of small amounts of 2',3'- 
dideoxynucleoside triphosphates. These cause chain termination when 
incorporated. 

DNA-sequencing work has yielded a wealth of information on gene
structure and chromosome organization in viruses, bacteria, and
eukaryotic cells. Following are a few examples: Sequencing of the
5375-base-long φX174 virus showed how this DNA could code for proteins
totaling 2000 amino acids, by allowing an overlap between some of the
genes.
The transition between exons and introns was shown to have certain 
characteristic bases common to all genes. The promoter region in front 
of the gene, the site for attachment of RNA polymerase, was also found 
to have a common base pattern. Light has been thrown on the structure
and the function of immunoglobulin genes, histocompatibility genes,
and oncogenes.

Fig. 8. (a) Autoradiograph of a gel showing labelled fragments obtained in the Maxam
and Gilbert DNA sequencing method. The reaction used for adenine affects both adenine
and guanine, the reaction used for cytosine affects both cytosine and thymine. (From
BIOCHEMISTRY, 2/E by Lubert Stryer. Copyright © 1975, 1981. Reprinted with
permission of W. H. Freeman and Company.) (b) An RFLP caused by presence or absence of
a site for restriction enzyme E.

The sequencing technique has developed quickly. In 1982, the full 
sequence of lambda DNA with its 48,502 bp was published (Sanger et
al., 1982). Now underway is a gigantic project, an international
cooperation to sequence the total human genome, with its 3 × 10^9 bp,
before the year 2005.

5. The Polymerase Chain Reaction

An important use of sequence data is in enzymatic amplification of 
segments of DNA in the polymerase chain reaction (PCR). PCR ampli¬ 
fication involves two oligonucleotide primers that flank the DNA seg¬ 
ment to be amplified and repeated cycles of heat denaturation, anneal¬ 
ing of the primers to the complementary sequences, and extension of 
the annealed primers with DNA polymerase. The primers hybridize to 
opposite strands of the target DNA and are oriented so that DNA is 
synthesized for the region between the primers. Because the extension 
products are also complementary and capable of binding the primers, 
each cycle doubles the amount of DNA resulting from the previous 
cycle. 
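
The doubling per cycle can be turned into simple arithmetic. The sketch
below is an idealized illustration: the `efficiency` parameter is an
assumption introduced here (1.0 means perfect doubling in every cycle),
and real reactions fall short of it, particularly in late cycles.

```python
import math


def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase in target DNA after the given number of cycles."""
    return (1.0 + efficiency) ** cycles


def cycles_for_fold(fold: float, efficiency: float = 1.0) -> int:
    """Smallest number of cycles that reaches at least the given fold increase."""
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


print(fold_amplification(10))   # ten perfect doublings
print(cycles_for_fold(1e7))     # cycles needed for a 10-million-fold increase
```

With perfect doubling, 24 cycles suffice for a 10-million-fold
amplification; at a lower assumed efficiency, more cycles are required.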

The basic idea of this method was described in 1971 by Kleppe et al., 
and later developed by Saiki et al. (1985). Its full potential appeared 
when Saiki et al. (1988) introduced the thermostable DNA polymerase 
from Thermus aquaticus (Taq), thus eliminating the need to add fresh
polymerase during each cycle. This modification not only simplified the 
procedure, making it amenable to automation, but also substantially 
increased its specificity, sensitivity, yield, and length of targets that 
could be amplified. Amplification of 10 million times was demon¬ 
strated, segments up to 2000 bp were readily amplified, and longer 
segments were amplified with reduced yield. 

The PCR technique has found wide application (Erlich, 1989; Erlich
et al., 1991a). Isolation of cloned segments from their vectors is
simplified by PCR amplification of these, using vector-specific primers
that flank the insertion site. RNA can be amplified as cDNA, starting
from the products of reverse transcription (Todd et al., 1987). DNA sequencing
with the enzymatic method can be done directly on the PCR product 
without ligation into a sequencing vector. For the latter purpose, it is 
advantageous, however, to purify the sequencing template by incor¬ 
porating biotin in one of the primers. After immobilization of the PCR 
product on streptavidin-coated magnetic beads, the unbiotinylated 
DNA strand is removed by alkali (Hultman et al., 1989). In diagnostic 
work and genetic mapping, PCR methods are used to a large extent, as 
will be apparent in later sections. For some applications, it is impor¬ 
tant to be aware that the Taq enzyme allows incorporation of one 
incorrect nucleotide for about 20,000 nucleotides incorporated. 
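
To give a feeling for what this error rate means, the sketch below
computes, for a single round of synthesis, the expected number of
misincorporations in a copy of a given length and the fraction of copies
without any error. It is a simplification made for illustration: errors
are treated as independent, and their accumulation over successive
cycles is not modeled.

```python
ERROR_RATE = 1 / 20_000  # misincorporations per nucleotide, as stated in the text


def expected_errors(length_bp: int, rate: float = ERROR_RATE) -> float:
    """Expected number of incorrect nucleotides in one newly made strand."""
    return length_bp * rate


def fraction_error_free(length_bp: int, rate: float = ERROR_RATE) -> float:
    """Probability that one newly made strand carries no error at all."""
    return (1.0 - rate) ** length_bp


print(expected_errors(2000))
print(fraction_error_free(2000))
```

For a 2000-bp segment, about one new strand in ten is expected to carry
an error after a single copying step. Because copies are copied again in
later cycles, the proportion in the final product is higher, which
matters when individual PCR products are cloned.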


III. Studies of Metabolic and Gene Regulation 

Students of metabolic regulation now have new tools at their disposal
through the advent of recombinant DNA technology. The area where
the impact has been the greatest is in the analysis of gene expression
and its regulation by hormones. This is due to the availability of cDNA 
probes for a broad variety of genes of metabolic interest. These probes 
make possible quantitation of the level of specific messengers. A num¬ 
ber of such studies, covering enzymes in glycolysis, gluconeogenesis, 
lipid metabolism and amino acid metabolism, are described by
Goodridge and Hanson (1986).


A. Tissue Culture Studies 

Several of the corresponding genes have been isolated and their 
promoter-regulator regions are being identified and characterized. 
One example is the rat gene for cytosolic phosphoenolpyruvate
carboxykinase (PEPCK), studied by Hod et al. (1986). The
promoter-regulator region had been localized to within a 621-bp
BamHI/BglII fragment present at the 5' end of the gene. The hormonal
control region was studied by linking the 621-bp fragment to the
structural segment of the herpes simplex thymidine kinase (TK) gene in
a plasmid vector. This construct was used to transfect cells of a
TK-deficient rat hepatoma cell line. The addition of dibutyryl cyclic
AMP (cAMP) to the cells resulted in four- to sixfold induction of both
TK activity and the level of
its mRNA. To define the sequences in the PEPCK promoter that are 
necessary for this cAMP effect, a series of graded deletions was con¬ 
structed. The transfected chimeric genes retained their responsiveness 
to cAMP as long as they included the sequence from -61 to -108. 
Deletion of this 47-bp fragment completely eliminated cAMP in- 
ducibility. Further studies of this and other cAMP-regulated genes 
have shown that a palindromic DNA sequence, TGACGTCA, termed 
CRE (cAMP response element) is sufficient for the effect. Other hor¬ 
mones, such as glucocorticoids, androgens, mineralocorticoids, estro¬ 
gen, thyroxine, and retinoic acid have specific response segments 
where the hormone-receptor complexes bind (Lucas and Granner, 
1992). cAMP normally exerts its effect through activation of protein 
kinase A (PKA). This is the case also here. A CRE-binding protein 
(CREB) that is phosphorylated by PKA, and thereby activated, was 
discovered (Yamamoto et al., 1988). 

PEPCK is actually regulated by several hormones and the dietary
state. Further transfection studies have uncovered eight
protein-binding domains between -460 and +73 and identified proteins
binding to these response elements (Park et al., 1993a). The interaction
of these partially tissue-specific transcription factors with the basic
transcriptional complex is currently being studied.





JENS G. HAUGE 


B. Regulation in Transgenic Animals 

While much may be learned from studies with cell cultures and how 
they react when recombinant DNA is introduced, this approach has its 
limitations. One fundamental set of regulatory problems concerns the 
development of an animal from the fertilized ovum. What sorts of 
mechanisms are responsible for tissue- and time-specific transcription 
of genes? Such questions are addressed through the use of transgenic 
organisms, particularly mice (Westphal, 1987; Grosveld and Kollias, 
1992). This important technique was introduced by Gordon et al. 
(1980). The DNA to be investigated has mostly been injected into the 
male pronucleus of a fertilized ovum, and the ovum implanted. Some of 
the ova develop into mice with the extra DNA integrated in their chro¬ 
mosomes, usually in many copies arranged head to tail. 

The first successful experiments dealing with tissue-specific
regulation were reported by Chada et al. (1985). They showed that a human
β-globin gene, in which the front end had been replaced by the
corresponding mouse gene, together with 1200 bp flanking DNA, was
expressed in transgenic mice exclusively in erythroid cells. They also
showed that the hybrid β-globin gene was expressed at the proper time
during development (Magram et al., 1985). The 5'-flanking region
must have been responsible for this specificity, along with
corresponding tissue-specific regulatory proteins.

Similar results have been found for a number of other genes and 
tissues (Palmiter and Brinster, 1986). A particularly striking example 
is the work on elastase I (Swift et al., 1984). Mice were made
transgenic with the rat elastase I gene and its flanking regions. A
marked tissue specificity was observed: rat elastase mRNA was 500,000
times more abundant in the pancreas than in the kidneys. Experiments with
trimmed down 5'-flanking DNA showed that 205 bp was sufficient to 
uphold the pancreas-specific expression (Ornitz et al., 1985). When this 
regulatory region was joined to the human growth hormone gene, hu¬ 
man growth hormone was found in the pancreas and not in other 
tissues. Immunofluorescence analysis, furthermore, demonstrated 
that the growth hormone was present in the acinar cells, not in the 
endocrine or connective tissue cells of the pancreas. 

The faithful tissue specificity observed with these kinds of constructs 
has been exploited in the study of oncogene expression (see Section 
IV,E). In one such study, Hanahan (1985) directed the expression of the
SV40 oncogene to pancreatic β-cells, using an insulin gene regulator
sequence.

This technique has also been used, for instance, to further
understand the complex and tissue-specific regulation of the PEPCK gene.
McGrane et al. (1988) fused the promoter-regulatory sequence of the 
PEPCK gene to the bovine growth hormone gene as a reporter gene. 
The transgene was expressed only in liver and kidney and showed 
dietary and hormonal responsiveness. Further transgenic mice that
carried mutations in specific regulatory domains were produced to
establish their regulatory roles (Patel et al., 1994).

Another method for producing transgenic animals uses embryonic 
stem (ES) cells. These are derived from an early embryo, cultivated,
and infected with a retrovirus construct (Robertson et al., 1986) or
transfected through the calcium phosphate-DNA precipitation technique
(Lovell-Badge et al., 1985). When they are reintroduced into a
blastocyst, they contribute to the production of a chimeric animal. In the
next generation, some pure transgenic animals appear. This approach 
has the advantage that somatic cell genetic techniques can be used to 
modify and to select cells with a desired potential. 


IV. Diagnosis of Genetic Disease 

A. Gene Changes in Hereditary Disease 

There are many ways in which mutations can affect the production 
of a given gene product. Studies of hemoglobin synthesis in humans 
have found examples of most of these possibilities. Replacement of a 
single base with another base can have the effect that an amino acid is 
replaced by another amino acid. There are 189 such structural
variants known for the human β-globin chain (Little, 1981). Not all of
these cause disease, however. The mechanism of polypeptide chain
termination is the source of other disturbances. In individuals with
hemoglobin
Constant Spring, the stop codon TAA for α-globin is replaced by CAA.
In a type of β°-thalassemia (no production of β-chains), the seventeenth
codon, AAG, is replaced by TAG, terminating the chain prematurely.

Single-base mutations may also affect transcription and RNA
processing. In β-thalassemia (low production of β-chains), this is often
observed. Some mutations in the promoter region reduce transcription
markedly. Changes at the exon-intron junctions lead to errors in the
removal of introns from the primary transcript, with unusable mRNA
as the result. Mutations can also generate new, false exon-intron
border sequences, again with defective mRNA as the result.

Deletions and insertions are other common types of mutation.
Deletions or insertions of 1, 2, or 4 bases are the cause of several
β°-thalassemias. Such a mutation destroys the reading frame
completely because the wrong set of triplets is read by the ribosomes.
Deletions of n × 3 bases can also occur, but here the reading frame is
not affected. Mutants with β-chain shortening from 1 to 5 amino acids
are known. Larger deletions do also occur; for hemoglobin genes,
deletions from 0.6 kb to about 20 kb are known.
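
The different consequences of deleting one base and deleting three
bases can be shown with a few lines of Python. The sketch uses the
standard genetic code and a short coding sequence that is entirely
hypothetical; it is not taken from a globin gene.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate complete triplets from the first base; '*' marks a stop codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def delete(seq: str, start: int, count: int) -> str:
    """Remove `count` bases beginning at 0-based position `start`."""
    return seq[:start] + seq[start + count:]


wild_type = "ATGGCTGAAAAGCTGTCTGGA"          # hypothetical sequence
one_base_deleted = delete(wild_type, 4, 1)    # frameshift
three_bases_deleted = delete(wild_type, 3, 3) # in-frame deletion

print(translate(wild_type))
print(translate(one_base_deleted))
print(translate(three_bases_deleted))
```

After the single-base deletion every downstream amino acid is changed,
whereas the three-base deletion removes one amino acid and leaves the
remainder of the chain intact.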

Several studies have uncovered a similar mutational heterogeneity 
for many other genes. In addition, a new type of mutation has been 
discovered in myotonic dystrophy (DM) and six other diseases (Miawa, 
1994). The DM gene contains the triplet repetition (GCT)n, with n
varying somewhat in the population. For affected individuals, n is 
increased, and n appears to correlate with the severity of the disease. 

B. Detection of Gene Changes with DNA Probes 

Recombinant DNA technology makes possible a direct approach to
the study and diagnosis of a genetic disease, instead of a study of
the phenotypic expression of the disease. For animal owners and
animal breeders, it will be valuable to be able to detect carriers of
recessive disease genes in order to avoid propagating the disease gene
or mating the animal with another carrier of the same defect.
Furthermore, the homozygous state of a recessive gene, or the heterozygous
state of a dominant gene, is not always expressed phenotypically. The 
disease may depend on an environmental factor for its expression or 
have a late onset. Here a DNA analysis could give an early diagnosis. 

Use of recombinant DNA methods in diagnosis and gene defect studies
has made possible great advances in the human sector and is well
underway in the veterinary sector also, as this volume demonstrates.
Common to the methods is the requirement for cloned or synthetic
DNA probes for the disease gene or its neighborhood, or for sequence
knowledge permitting the use of PCR methods.

1. Mutations that Create or Destroy a Restriction Site

The ideal situation exists when the mutation either creates a new 
restriction enzyme site or removes one that was present in the wild 
type. This is the case for sickle cell anemia in humans. In sickle cell
anemia, the sixth amino acid in the β-chain of hemoglobin, glutamate,
has been replaced by valine because an A in the corresponding codon
has been replaced by T. In normal individuals, the restriction enzyme
DdeI cuts at three places in the β-globin gene, wherever the sequence
CTNAG occurs (N can be any base). The mutation has the effect that
the second of these sites is destroyed. Sickle cell patients thus will have
one longer DNA fragment hybridizing with a β-globin probe, while
normal persons will have two shorter fragments. This is revealed by 
the method of Southern blotting, described in Section II,D,4. A hetero¬ 
zygote would be revealed by all three bands being present. 
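
The loss of the middle DdeI site can be followed on a toy sequence. In
the sketch below, only the codons 5-7 segment (CCT GAG GAG, with GAG
changed to GTG in the sickle allele) reflects the β-globin gene; the
flanking bases, and therefore all fragment lengths, are hypothetical and
were chosen simply to provide one DdeI site on either side.

```python
import re

# DdeI recognizes CTNAG and cuts after the first C. The lookahead lets
# the scanner report every site, including overlapping ones.
DDEI_SITE = re.compile(r"(?=CT[ACGT]AG)")


def ddei_cut_positions(seq: str) -> list[int]:
    """0-based positions at which DdeI cuts the given strand."""
    return [m.start() + 1 for m in DDEI_SITE.finditer(seq)]


def fragment_lengths(seq: str) -> list[int]:
    """Lengths of the fragments produced by complete DdeI digestion."""
    cuts = [0] + ddei_cut_positions(seq) + [len(seq)]
    return [b - a for a, b in zip(cuts, cuts[1:])]


# Hypothetical flanks around the codons 5-7 segment.
left_flank = "AACTTAGGTACA" + "ACA"
codons_5_to_7 = "CCTGAGGAG"
right_flank = "AAGTCTGCC" + "CTCAG" + "TT"

normal = left_flank + codons_5_to_7 + right_flank
sickle = left_flank + "CCTGTGGAG" + right_flank  # A -> T in codon 6

print(fragment_lengths(normal))
print(fragment_lengths(sickle))
```

In the toy example the two internal fragments of the normal allele
(14 and 17 bases) are replaced by a single fragment of 31 bases in the
sickle allele. A heterozygote would show all three.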

Each restriction digest requires 5-10 µg of patient DNA. This is
usually obtained from the buffy coat of peripheral blood samples. The
leukocytes present in 10-20 ml of human blood yield up to 200 µg of
DNA after treatment with proteinase K and extractions with phenol 
and chloroform. 

There are many other examples of diagnoses based on change in a
restriction site, e.g., in the ras family of oncogenes. This is also the
case for induced Ha-ras mutations in the rat (Zarbl et al., 1985). A more
convenient version of this test amplifies the region around the
mutation with PCR and subjects the PCR product to restriction enzyme
treatment. Sufficient DNA is often present to allow direct detection of
the restriction fragments after electrophoresis, using ethidium
bromide and ultraviolet light.

2. Deletion and Insertion Mutations 

When deletions or insertions of a few hundred or more base pairs are 
the cause of the disease, a diagnosis is often easier. These mutations 
will directly affect the length of the DNA fragment between two given 
restriction sites. Deletions occur in the human globin genes, the gene 
causing Duchenne type muscular dystrophy, the gene for coagulation 
factor VIII, and others. 

3. Single-Base Replacement in a Known DNA Sequence 

Base replacement mutations that do not affect any restriction site 
may still be diagnosed if one knows the DNA sequence around the 
mutation site. One can then synthesize a wild-type allele-specific oli¬ 
gonucleotide (ASO) of 18-20 bases, label it with radioactivity, and 
study how it hybridizes to DNA fragments on a membrane. Under 
stringent hybridization conditions, it is possible to detect a mismatch 
of only one base between the probe and the DNA investigated. It is 
prudent to also make a probe corresponding to the mutant and
demonstrate its reduced hybridization to normal DNA. This method is used
in diagnosis of α1-antitrypsin deficiency (Kidd et al., 1984), Ha-ras
mutations in rats (Zarbl et al., 1985), and several other mutations.

The ASO method is more reliable and convenient when carried out 
on a PCR-amplified segment containing the mutation. The sensitivity 
thus reached makes possible the use of nonradioactive, enzyme-labeled 
ASO probes in dot blot hybridization (Scharf et al., 1991). The
occurrence of many different mutations in the same gene in the population
can be analyzed using a reverse dot blot method. Instead of having
numerous membranes with the PCR product applied, each to hybridize
with a different ASO probe, a membrane that contains an immobilized
array of ASO probes is hybridized to the PCR product (Erlich et al.,
1991b).

4. Restriction Fragment Length Polymorphisms (RFLPs) Linked to the Gene of Interest

Earlier, the most widely used and general diagnostic method was 
that of RFLPs linked to the gene that was being investigated. With this 
technique, one needs no knowledge of the structure of the gene. It is 
based on the fact that the heterozygosity at the level of DNA is very
large. If the base sequences of two homologous chromosomes are
compared, one finds in the human a difference for each 200 to 400 bp
(Cooper and Schmidtke, 1984). The majority of these differences are
neutral mutations. A few of them reside in the third base of coding
triplets, most of them in introns and in the regions between the genes.

Some of these mutations will affect known restriction enzymes and 
will be within or close to the gene of interest, so that they can be 
observed as RFLPs with a probe from the same region (Fig. 8b). Such
an RFLP can be used as a marker for the gene studied. The procedure 
is essentially the same as shown in Fig. 7b. Presence and absence of 
the restriction site will give two different restriction fragment lengths, 
moving different distances on the gel. 

a. Study of Linked RFLPs in Families. The use of RFLPs for diag¬ 
nosis has one drawback: One does not arrive at an answer by investi¬ 
gating the DNA of a single individual, as was possible with the first 
three methods described in this section. The chromosome with the 
mutant form of the gene will not always have the same restriction site 
allele, for instance, presence of the restriction site. That may have been 
the original constellation, but only a fraction of chromosomes with this 
restriction site need have the gene mutation. By recombination, the 
mutant gene may also have landed on a chromosome with this restric¬ 
tion site absent. If the distance between the restriction site concerned 
and the gene studied is small, it is probable that the constellation will 
remain the same within a given family. 

The use of this method is illustrated in the following example from a
family with β-thalassemia. Both parents were healthy, but a child had
been affected. During the three subsequent pregnancies, prenatal
diagnosis was carried out. Using the restriction enzyme AvaII, Southern
blotting, and a labeled β-globin probe, the following band pattern was
observed (Fig. 9).

Fig. 9. Prenatal diagnosis of β-thalassemia. (Reproduced with permission, from the
Annual Review of Medicine, Volume 37, 1986, by Annual Reviews, Inc.)


Both parents were RFLP + -, one chromosome having the restriction
site and the other not. The affected child was + +. The + chromosome
from both parents must have carried the disease gene. One could
conclude from this that the fetuses were noncarrier, affected, and
carrier, respectively. The probe cross-hybridizes with δ-globin sequences
common to all.
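
The reasoning in this family can be written out as a small rule. The
sketch assumes, as in the example, that the disease gene lies on the +
chromosome in both parents and that no recombination has occurred
between the restriction site and the gene. The three fetal genotypes
listed are those implied by the conclusions stated above; they are not
read from Fig. 9.

```python
def fetus_status(genotype: str, disease_linked_allele: str = "+") -> str:
    """Classify a fetus from its two RFLP alleles, e.g. '+-'.

    Assumes the disease gene travels with `disease_linked_allele` in both
    parents and that no crossing over has separated marker and gene.
    """
    copies = genotype.count(disease_linked_allele)
    return {0: "noncarrier", 1: "carrier", 2: "affected"}[copies]


# Genotypes implied by the text for the three pregnancies.
fetuses = ["--", "++", "+-"]
statuses = [fetus_status(g) for g in fetuses]
print(statuses)
```

The rule only holds within this family; in another family the disease
gene may be linked to the - allele, and the phase must be established
again from an affected relative.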

b. Usefulness of Having More than One RFLP. The example just
described was a favorable situation. It would have been possible to
diagnose only 50% of the offspring correctly if the thalassemia gene in
one parent had been on an RFLP - chromosome. The situation might
have improved if another RFLP had been close by. With two RFLPs,
one would have four combinations, so-called haplotypes. This is
equivalent to having four alleles for the marker locus, which gives a
favorable degree of heterozygosity.

If the distance between the gene and the RFLP site is large—e.g., 
some thousand kb—it is important to have a second RFLP situated on 
the opposite side of the gene. A crossing over during meiosis between 
the first RFLP and the gene would then be detected because a double 
crossing over is very unlikely. Such a flanking RFLP is also useful in 
work attempting to localize and study the gene if the gene is not yet 
characterized. 

c. Searching for RFLPs. The search for RFLPs does not have to be 
a random trial and error testing of the large battery of commercially 
available enzymes. Cooper and Schmidtke (1984), Wijsman (1984), and 
Feder et al. (1985) describe rational approaches and give lists of
relative efficiency for many enzymes. Enzymes containing the dinucleotide
CG in their recognition sequences detect more variation of the base
replacement type than other enzymes, probably because of mutations
from methylated CGs to TGs. Larger probes are found to be more
efficient than small probes.

The screening should be limited, as suggested by Skolnick and White 
(1982), to a panel of about 10 unrelated individuals to avoid picking up 
rare polymorphisms, which have low degrees of heterozygosity and, 
therefore, are inferior in diagnostic work. 

When multiple restriction fragments are found, it must be shown 
that they represent alleles at the same locus. A probe might recognize 
nonallelic fragments due to sequence homology with a different chro¬ 
mosomal region. If the fragments are alleles, they will segregate in a 
Mendelian fashion. 

d. Minisatellites. Studies of gene structure have uncovered
situations where a single restriction enzyme reveals a multiallele
polymorphism, a polymorphism that does not affect sites for the enzyme.
Between two fixed restriction sites, one finds a region consisting of
tandem repetition of 15-100 bp units, and the number of these repeats
varies considerably in the population. Such variable number of tandem
repeat (VNTR) regions, also called minisatellites, have been found in
humans close to the insulin gene (Bell et al., 1982), the α-related
globin genes (Proudfoot et al., 1982), and the c-Ha-ras oncogene (Capon
et al., 1983), but similar VNTR regions appear widely dispersed
(Jeffreys et al., 1985a). The presence near a disease gene of a VNTR
region is very useful, as the multiallelic nature of the corresponding
RFLPs leads to high degrees of heterozygosity. A method for a
systematic search for such RFLPs has been reported (Nakamura et al., 1987).

e. Microsatellites. Weber and May (1989) reported the existence of 
a large set of highly polymorphic microsatellites, consisting of di¬ 
nucleotide repeats, that could be typed using PCR. Such micro¬ 
satellites have become the most important source of markers for high- 
resolution genetic maps. A human map with 814 microsatellites was 
presented in 1992 (Weissenbach et al., 1992). Two years later, the map 
included 2066 (AC)n markers, flanked by unique sequences for which
PCR primers could be synthesized (Gyapay et al., 1994). The average 
distance between the markers was 2.9 cM. This means that there are 
now polymorphic microsatellite markers close enough to most disease 
genes for diagnostic purposes, forming at the same time starting points 
for work to identify the genes and their mutations. The latter task is 
being made easier by the parallel development of physical maps, con¬ 
sisting of contigs of yeast artificial chromosome (YAC) clones (Cohen et 
al., 1993), ordered using sequence-tagged sites (STSs) (Olson et al., 
1989). 

5. Disease Gene Identification

Rapid progress has been made in the identification and
characterization of disease genes in humans, a prerequisite for application
of the direct methods of diagnosis described in Sections IV,B,1-3. Many
genes have been identified by functional cloning, using information
about the protein product (sequence or antibodies). Mapping of the
gene then followed cloning. Starting in 1986, another approach,
positional cloning, came into use. Here, mapping to a definite location on
a chromosome is the basis for the cloning effort. Some 20 disease 
genes have been identified in this way (Collins, 1992). As the human 
genome project produces high-resolution genetic maps and physical 
maps based on overlapping clones, positional cloning will be sim¬ 
plified. 

In other instances, a map position has been important in order to 
verify a gene identification, but the availability of a plausible candi¬ 
date protein and its cDNA has made it possible to avoid exhaustive 
cloning in the specified chromosome region. This positional candidate 
approach will probably become increasingly common (Ballabio, 1993). 
In all, there are now about 400 human disease genes that can be 
diagnosed directly. 


6. Genetic Screening

Carrier screening programs have been attempted for some human
genetic diseases even before DNA tests became available, notably
sickle cell anemia, thalassemia, and Tay-Sachs disease. The thalassemia
program in Cyprus, testing couples at marriage, has been very success¬ 
ful. Pilot programs testing pregnant women for the cystic fibrosis gene 
and the partner if the test was positive have taken place in Britain and 
the United States (Williamson, 1993). Although there are legitimate 
concerns that screening data may be used to coerce families to make 
reproductive decisions that meet economic or societal objectives rather 
than personal wishes, the availability of a screening service is gener¬ 
ally welcomed. 

For human diseases, however, there is often the problem of genetic 
heterogeneity. The number of different cystic fibrosis mutations is now 
greater than 260. Although most of these are rare—11 mutations mak¬ 
ing up 91.5% of the mutations in Northwest England (Ferec et al., 
1992)—a screening program for cystic fibrosis will clearly leave a small 
residual risk for unexpected development of the disease. For animal 
genetic diseases, there is less heterogeneity, making screening an ef¬ 
fective and widespread tool (see Section IV,D). 

C. DNA Probes and Disease Susceptibility

Recombinant DNA technology also has made it possible to begin to 
analyze polygenic diseases, i.e., diseases in which a combination of 
unfavorable genes may predispose to the development of disease. 

1. HLA Genes 

Genes determining susceptibility or resistance to a broad variety of
diseases in humans have been localized to the HLA region, the major
histocompatibility complex in humans (Vogel and Motulsky, 1986;
Kostyu, 1991). Among these diseases are ankylosing spondylitis, celiac
disease, dermatitis herpetiformis, and insulin-dependent diabetes
mellitus (IDDM). The latter three are associated with the HLA
serologic specificity DR3, IDDM also with DR4. Seventy-five percent of
IDDM patients have HLA-DR4, while only 32% of controls do. This
gives a DR4 individual a relative IDDM risk of 6.4. The risk is
increased in DR3,4 heterozygotes (Knip et al., 1986). The association of
HLA-DR genes with IDDM susceptibility is related to the fact that 
IDDM pathogenesis involves autoimmune phenomena (Nepom and 
Erlich, 1991). 
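
The relative risk of 6.4 quoted above can be reproduced from the two frequencies given. The sketch below assumes that the figure is the cross-product (odds) ratio commonly used in HLA association studies; that assumption, and the function name, are not stated in the text.

```python
def relative_risk(freq_patients, freq_controls):
    """Cross-product (odds) ratio for a marker, from its frequency in patients and controls."""
    odds_patients = freq_patients / (1 - freq_patients)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_patients / odds_controls

# HLA-DR4: 75% of IDDM patients, 32% of controls (figures from the text)
rr = relative_risk(0.75, 0.32)
print(round(rr, 1))
```

With these inputs the ratio is (0.75/0.25)/(0.32/0.68), which rounds to the 6.4 given in the text.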

With recombinant DNA technology, it became possible to answer the 
following question: Are there DNA differences, detectable as restric¬ 
tion fragment length differences, between patients and controls of the 
same DR3 or DR4 specificity? This question was indeed answered 
when Owerbach et al. (1983), using a DQβ probe, found a significantly
decreased frequency of a BamHI 3.7-kb fragment among DR4 IDDM
patients. These studies have been confirmed and extended by others, 
including Michelsen and Lernmark (1987). 

The full impact of DNA technology, however, came with PCR amplification
of cDNA from the first exons of α and β chains, the exons
containing most of the polymorphisms, followed by sequencing (Todd et
al., 1987). This procedure has identified a large number of α- and
β-chain alleles, denoted by a four-digit code. Typing of DNA sequence
polymorphisms at HLA loci is conveniently done by PCR amplification
of genomic DNA with appropriate primers, followed by probing with
sequence-specific oligonucleotide probes (Erlich et al., 1991b). Such
studies have shown that molecules of DQ(α1*0301,β1*0302), associated
with DR4, and DQ(α1*0501,β1*0201), associated with DR3, give a
relative IDDM risk of 6-18 for Caucasians. In heterozygotes, two further
DQ molecules may form by trans combination of the subunits
(Thorsby and Rønningen, 1993; Nepom and Erlich, 1991). Results of
this nature have also been reported for malaria, where
DRB1*1302-DQB1*0501 gives some protection (Hill et al., 1991), and for
papillomavirus-associated cervical carcinoma, where DRB1*1501-DQB1*0602
gave increased risk (relative risk 4.8) (Apple et al., 1994).

2. Insulin Gene 5'-VNTR Alleles 

As mentioned in Section IV,B,4,c, there is a VNTR associated with 
the human insulin gene on chromosome 11. The variable region falls in 
three size classes (class I: average 570 bp, class II: average 1200 bp, 
and class III: average 2200 bp). A strikingly higher frequency of class I 
alleles was observed in Caucasians with IDDM (Bell et al., 1984).
Lucassen et al. (1993) have, by sequencing the insulin gene locus from
patients and controls, identified the region of association to IDDM to 
be 4.1 kb. Ten polymorphisms, including the VNTR, are in strong link¬ 
age disequilibrium with each other, showing a relative risk for IDDM 
of 3.8-4.5. Analysis of other ethnic groups may reveal exactly which 
polymorphism(s) confer susceptibility. It is possible that the VNTR 
length directly affects transcription. 

The insulin gene-associated polymorphism and the HLA variation
are not sufficient to account for the development of IDDM. Extensive 
mapping of the disease in the NOD (nonobese diabetic) mouse, using 
microsatellite markers, has revealed eight additional loci that make a 
contribution to this polygenic disease (Ghosh et al., 1993). One of these
codes for interleukin-2, which has a different sequence in NOD mice 
and normal mice. Interleukin-2 has a role in autoimmunity. 

3. Atherosclerosis 

The most plausible link between genes and atherosclerosis is found 
in the structure and regulation of genes for lipoproteins, for enzymes
taking part in their metabolism, and for lipoprotein receptors. Genes for
the apolipoproteins and the low-density lipoprotein (LDL) receptor in 
humans have been cloned and localized on chromosomes. One of every 
500 persons is heterozygous for a mutant allele of the LDL receptor, 
leading to 2-3 times normal levels of cholesterol. Hobbs et al. (1992)
have unraveled the mechanisms involved and characterized different
types of mutations in the gene. Mapping of the defects in a population
can start with a determination of mutation-associated RFLP
haplotypes. This information is useful for diagnosis within families
(Rødningen et al., 1992). In order to establish direct gene tests, DNA
from affected individuals was screened for mutations (Leren et al., 
1993). 

LDL is bound to its receptor via apoB-100, and hypercholesterolemia 
may be caused by mutations in the apoB-100 gene as well (Innerarity
et al., 1990). Mutations are known also in the apoA-I gene. These give
low levels of high-density lipoprotein (HDL), accompanied by
atherosclerosis. Type III hyperlipoproteinaemia patients have a particular
allele for the apoE gene. With the current knowledge of DNA structure, 
oligonucleotide probes that differentiate between normal and mutant 
genes can be made, allowing screening of people at risk. 

D. Application to Domestic Animals 

The application of recombinant DNA techniques to diagnosis in do¬ 
mestic animals is now quite widespread. The availability of DNA 
clones, or knowledge of primer sequences, from humans and small 
experimental animals has speeded this development. 

1. Diagnosis of Disease Mutations 

The first disease to benefit from a DNA test was citrullinemia, which 
is widespread in Australian Friesian cattle. Three conditions were 
helpful for the establishment of a DNA test for the mutation in 1989: 
bovine cDNA libraries were available, sequenced cDNAs for the en¬ 
zyme for humans and rats were available, and the protein is fairly 
small, only 412 amino acids. 

Normal bovine cDNA for the enzyme was isolated with a rat probe 
and sequenced by Dennis et al. (1989). In order to identify the muta¬ 
tion, total mRNA was prepared from the liver of an affected animal 
and cDNA synthesized. The product was amplified with PCR, using 
primers corresponding to sequences immediately before and after the 
coding region of the cDNA. These primers had been extended with 
BamHI linkers, so that the amplification product could be ligated into
the BamHI site of the DNA sequencing vector. In clones from the
mutant, a C to T transition in the first position of codon 86 was found. 
This generates a TGA termination codon, which leads to a truncated 
and inactive protein. 

The change in codon 86 leads to the disappearance of an AvaII
restriction site. This was used by Dennis et al. to design a simple,
PCR-based gene test on genomic DNA. A 194-bp DNA could be amplified
from within the exon containing the mutation. A portion of the product
was digested with AvaII, and both digested and undigested DNA were
analyzed by electrophoresis. The normal calf then gives two bands of 72
and 118 bp, the homozygously affected calf one band of 194 bp, and a 
carrier all three bands. 
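
The band patterns just described translate directly into a genotype call. The sketch below is illustrative only: the function name and the idea of passing the observed fragment sizes as a list are assumptions, and a real gel would need a size tolerance rather than exact matching.

```python
def call_citrullinemia_genotype(bands_bp):
    """Classify an animal from the fragment sizes (bp) seen after AvaII digestion.

    Band sizes follow the text: the normal allele is cut into 72 + 118 bp,
    the mutant allele has lost the AvaII site and stays at 194 bp.
    """
    bands = set(bands_bp)
    cut = {72, 118}      # fragments from the normal allele
    uncut = {194}        # undigested product from the mutant allele
    if bands == cut:
        return "normal"
    if bands == uncut:
        return "affected (homozygous mutant)"
    if bands == cut | uncut:
        return "carrier"
    return "uninterpretable"

for lane in ([72, 118], [194], [72, 118, 194], [118]):
    print(lane, "->", call_citrullinemia_genotype(lane))
```

Any pattern other than the three described in the text is reported as uninterpretable rather than forced into a genotype.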

Work with malignant hyperthermia (MH) in pigs had started earlier. 
We isolated clones for the glucosephosphate isomerase (GPI) gene, a
closely linked gene (Davies et al., 1987). With such DNA as probe, a
multiallelic RFLP was detected and was confirmed to be tightly linked 
to the MH gene (Davies et al., 1988), so it could serve as a marker. The 
further search for the MH mutation was based on reports that calcium 
is more easily released from the sarcoplasmic reticulum of MH pigs. 
Meissner (1986) had shown that the alkaloid ryanodine interacts directly
with the calcium release channel (CRC) of rabbits. Using tritiated
ryanodine as a label, Lai et al. (1988) purified the channel. This work
prepared the way for Takeshima et al. (1989) to clone the rabbit CRC 
cDNA, a 15,100-bp-long cDNA. With a small rabbit cDNA clone as a 
probe, we isolated a porcine cDNA clone and showed it to hybridize in 
situ to metaphase chromosomes in the same chromosomal band as GPI 
DNA hybridized (Harbitz et al., 1990). This confirmed CRC as a good 
candidate for the product of the MH locus. Sequencing CRC cDNA 
from a normal and an affected pig, Fujii et al. (1991) identified the 
mutation, a substitution of T for C at nucleotide 1843, changing an 
arginine to cysteine. The mutation was found to be the same in five 
Canadian pig breeds, in British Landrace (Otsu et al., 1991), and in 
Norwegian Landrace (Harbitz et al., 1992). 

The mutation destroys a HinPI site and creates an HgiAI site. Fujii et
al. amplified a 74-bp piece from the exon containing the mutation and 
made the restriction site changes in the product the basis for their 
diagnosis. A more convenient and reliable method became possible 
after sequencing of genomic DNA flanking the exon. 

Recent years have added three more diseases to the list: leukocyte
adhesion deficiency (LAD) (Schuster et al., 1992) and deficiency of uridine
monophosphate synthase (DUMPS) (Schwenger et al., 1993) in
cattle, and hyperkalemic periodic paralysis (HYPP) in quarter horses 
(Rudolph et al., 1992). In the study by Rudolph et al., the mutated 
region was found using single-strand conformation polymorphism 
(SSCP) analysis on PCR-amplified sections of the cDNA (Orita et al., 
1989). This is a good general method for detecting existing two-allele 
polymorphic markers (Neibergs et al., 1993). 

2. Diagnosis of Disease Susceptibility 

Association between diseases and alleles of genes in the major histo¬ 
compatibility complex (MHC) is also known for domestic animals. This 
is particularly well documented for chickens with Marek’s disease,
caused by a herpesvirus (Hanson et al., 1967; Hepkema et al., 1993). 
Resistance to bovine leukosis, caused by a retrovirus, also appears to 
have an MHC association (Lewin and Bernoco, 1986), and Mejdell et al. 
(1994) found an influence of the bovine MHC on resistance to
mastitis. Park et al. (1993b) found a highly significant association between
the bovine MHC class I antigen BoLA-A8 and chronic posterior spinal
paresis (a form of ankylosing spondylitis) in Holstein bulls. The relative
risk was 34.6. The special interest in developing RFLPs to establish
haplotypes for the MHC genes in domestic animals must be viewed
in this light. MHC RFLPs have been found with human probes in a 
large number of species (Andersson et al., 1986; Juul-Madsen et al., 
1993). Additional studies are now being carried out based on sequenc¬ 
ing of the MHC polymorphic exons. The data have been presented for 
cattle (Andersson et al., 1991), sheep (Fabb et al., 1993), horses (Szalai 
et al., 1993), and pigs (Våge et al., 1994).


E. Gene Changes in Cancer 

Cancer, like genetic diseases, is a result of changes in the hereditary 
material, but in cancer, the changes are mainly in somatic cells. Much 
new information on the phenomenon of neoplastic growth has been 
provided with the help of recombinant DNA techniques. The molecular 
structures of a long series of viral oncogenes and cellular proto-on¬ 
cogenes, often coding for proteins participating in signal transduction 
from the cell surface to the nucleus, have been elucidated, and the 
mechanisms for the activation of proto-oncogenes to oncogenes to a 
large extent clarified (Land et al., 1983; Van de Woude et al., 1984). For 
the ras family of oncogenes, this activation takes place by a base re¬ 
placement, and for others—e.g., myc, myb, and ets-1 —by translocation 
or amplification. Oncogene probes are available for such studies. 
Translocations, and sometimes mutations, result in restriction frag¬ 
ment size changes. Amplification leads to stronger bands of hybridiz¬ 
ing DNA. Amplification, or stimulated transcription, leads to increased 
hybridization to mRNA. 

These developments have clinical relevance. DNA methods are be¬ 
coming useful in tumor diagnosis, and they give new information on 
the stage of development of the tumor. Oncogenes are highly conserved 
in evolution, so that it is likely that human and murine oncogene 
probes can be used to investigate cancer development in domestic
animals. Human probes for c-myc, myb, HER-2/neu, H-ras, K-ras, and
N-ras thus were found to hybridize to DNA fragments from dogs
(Hauge et al., 1988). 

While activated oncogenes act in a dominant manner, mutations in 
tumor suppressor genes are recessive. Their gene products inhibit cell 
division and thus balance the stimulatory properties of the
proto-oncogene products. Loss of function of the suppressor gene on both
chromosomes leads to increased cell division. Because the first mutation is
often transmitted from a parent, while the second arises somatically, 
such tumors are found to be hereditary. Well-studied examples are the 
genes for retinoblastoma and p53, as well as several suppressor genes 
involved in colorectal cancer (Weinberg, 1991). Gene tests for these 
mutations are of obvious value for early diagnosis and treatment. 


V. Therapy of Genetic Diseases

Gene therapy may not be of practical value in regular veterinary 
medicine in the foreseeable future because of the cost involved. But 
work with gene therapy in animals will nonetheless be important, as 
preparation for such therapy in humans. What needs to be shown in 
animals is: (1) that the gene is integrated in the target cells, (2) that it 
is expressed on a suitable level, and (3) that the gene does not harm the 
cells or the animal. The ideal would be to exchange the mutated gene 
with a wild-type gene. Techniques for doing this by homologous
recombination in ES cells have been developed (Smithies et al., 1985;
Thomas and Capecchi, 1986).

Gene therapy can be conceived at different levels: introduction of 
DNA in the fertilized ovum or early embryo; in cells or tissues that are 
taken out, modified, and reimplanted; and in the whole animal, with 
the help of an infective vector. 

A. Germ Line Gene Therapy 

Several successful experiments in germ line gene therapy have been 
reported. Hammer et al. (1985) described how injection of the rat
growth hormone (GH) gene, linked to the mouse metallothionein
promoter, into fertilized ova from dwarf mice with low GH levels yielded
mice with normal growth. The immune response to a tripeptide has been
restored in mice from a line with a defective MHC class II E gene by
injection of DNA containing this gene. β-Thalassemia has also been
successfully corrected by injection of β-globin DNA in a murine model
of human β-thalassemia (Costantini et al., 1986).


B. Somatic Gene Therapy 

Human gene therapy has come of age (Miller, 1992; Friedman, 1993; 
Tolstoshev and Anderson, 1993). Protocols for treatment of 14 diseases 
have been approved (Wivel, 1993) based on results obtained with
animals. The first disease treated was adenosine deaminase (ADA)
deficiency, causing immunodeficiency. A retrovirus vector was used to
introduce the normal cDNA in lymphocytes. Among the diseases recently
attacked are cystic fibrosis and hypercholesterolemia. 

Cystic fibrosis affects about 1 in 2000 Caucasians. Mouse models for 
cystic fibrosis have been generated by homologous recombination in 
embryonic stem cells, in one case by replacement of part of the cystic 
fibrosis transmembrane conductance regulator (CFTR) gene with a 
mutated gene section (Snouwaert et al., 1992), in the other case by
insertion of a mutated gene in the host gene (Dorin et al., 1992). The
first type is a null mutation, while the latter has a small degree of
leakiness. Hyde et al. (1993) showed that CFTR cDNA could be delivered
to the lungs of the replacement mutant by direct instillation of a
cDNA-liposome cocktail, resulting in correction of the ion conductance 
defect. Other workers have used adenovirus as a vector. Unlike retro¬ 
virus, adenovirus is capable of infecting terminally differentiated cell 
types and is naturally drawn to airway epithelial cells. Good expres¬ 
sion of human CFTR cDNA, carried on replication-deficient ade¬ 
novirus, was observed after intratracheal introduction in cotton rats. 
Adenovirus is not integrated in host DNA. Studies have therefore been 
carried out on the safety and efficiency of repeated CFTR cDNA trans¬ 
fer in cotton rats and primates (Zabner et al., 1994). 

Another cDNA that has been carried on adenovirus into rat lung 
epithelium and expressed is cDNA for α1-antitrypsin (Rosenfeld et al.,
1991). Adenovirus can also be used for muscle diseases. Vincent et al. 
(1993) have observed long-term correction of dystrophic degeneration 
in a mouse model for Duchenne type muscular dystrophy after intra¬ 
muscular injection of adenovirus carrying a human dystrophin mini¬ 
gene. 

Familial hypercholesterolemia (FH) affects individuals heterozygous 
for LDL receptor mutations (see Section IV,C,3) and, very severely,
those who are homozygous. A strain of rabbits genetically deficient
in LDL receptors was used to demonstrate the potential efficacy
of ex vivo gene therapy. Part of the liver was resected and perfused 
with collagenase; the hepatocytes were transduced with LDL receptor 
retrovirus and infused in the rabbit liver, with good effect (Chowdhury 
et al., 1991). The experiment was followed up using dogs and baboons.
The first results in humans were reported by Grossman et al. (1994). A 
29-year-old woman, homozygous for the disease, underwent the same 
procedure, and her LDL/HDL ratio declined from 10-13 before treatment
to 5-8 following gene therapy.

VI. Use of Recombinant DNA Methods
to Improve Domestic Animals 

A. Markers and Maps 

One important use of DNA methods for the improvement of domestic 
animals has been discussed already: marker-assisted selection against 
animals that carry a disease gene that has been characterized at the 
molecular level. On the whole, however, specific genetic disorders are 
less important than traits affecting growth, reproduction, disease re¬ 
sistance, and the quality of the end product. These traits are usually 
determined by several genes, the quantitative trait loci (QTLs). In order
to enable marker-assisted selection for a QTL, it must be fairly precise¬ 
ly localized on a chromosome. This requires genetic and physical maps 
of high resolution. Much effort has gone into establishing such maps
in recent years.

The work is furthest advanced for pigs. At the First Pig Gene Mapping
Workshop (PGM1) (Andersson et al., 1993), the number of mapped loci
was 170. Sixty of these were anonymous DNA segments,
mostly microsatellites, and the rest were polymorphic blood groups
and protein polymorphisms. Rohrer et al. (1994) established a genetic
linkage map with 376 microsatellite and 7 RFLP loci, using a two-generation
reference population. The average distance between adjacent
markers was 5.5 cM. A genetic linkage map with somewhat lower
resolution (11 cM average spacing), but with 60 reference markers for
comparative mapping and 47 markers physically assigned by in situ
hybridization, was reported by Ellegren et al. (1994), based on a cross
between the European Wild Boar and a domestic breed (Large White).
This material has been analyzed with respect to some quantitative
traits, and evidence for QTLs with large effects on growth, length of
the small intestine, and fat deposition was found on chromosome 4
(Andersson et al., 1994). Comparative gene mapping for pig chromosome
4 could indicate candidates for these QTLs because clusters of
neighboring genes tend to be preserved between species.

An example of the comparative approach was the focusing on the
long arm of human chromosome 19 in the search for the malignant
hyperthermia (MH) locus. It was known that the GPI gene was located
here, and that GPI and MH were closely linked in pigs. Another striking
example is the observation of linkage for the human secreted
phosphoprotein 1 (SSP1) locus to the sheep Booroola fecundity (FecB) gene,
and the subsequent finding of linkage to human epidermal growth
factor and complement I genes, near SSP1 in 4q in humans. The latter
genes define a candidate region on sheep chromosome 6 (Montgomery
et al., 1993).

Good progress has also been achieved in genome mapping in cattle. 
Barendse et al. (1994) mapped 171 loci, with an average distance be¬ 
tween markers of 15 cM. Fifty-six loci represent DNA information from 
other species. An American-Swiss collaboration produced at the same 
time a cattle map with 313 polymorphic markers, with an average 
spacing of 8 cM (Bishop et al., 1994). An extensive comparison between
cattle, cat, mouse, and humans is presented by O’Brien et al. (1993). A 
good foundation now appears to exist for the isolation of bovine QTLs, 
as well as for the study of mammalian genome evolution. 


B. Parentage Identification 

Errors may occur, for example, in marking semen portions used in 
artificial insemination. Piglets may jump from one pen to another and 
be falsely marked when marking takes place. It may, at times, be a 
temptation for a breeder to report a false father or mother. These 
problems can be addressed using core sequences for minisatellites, 
which hybridize with VNTRs of many genes, producing a DNA finger¬ 
print (Jeffreys et al., 1985b). Because each band represents a mini¬ 
satellite locus, the probability that two individuals would have the 
same pattern is nearly zero, except for monozygotic twins. A multilocus 
microsatellite probe, such as (GTG)5, yields a similar result, with better
coverage of the genome (Mörsch and Leibenguth, 1994). Alternatively,
one can combine the PCR results for a set of highly polymorphic
microsatellites (Marklund et al., 1994).
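
As a rough illustration of the last approach, a claimed parentage can be tested locus by locus: at every microsatellite the offspring must carry one allele found in the dam and one found in the sire. The marker names and allele sizes below are invented for the example, and the sketch ignores mutation, null alleles, and typing errors.

```python
def compatible_at_locus(offspring, dam, sire):
    """True if one offspring allele can come from the dam and the other from the sire."""
    a, b = offspring
    return (a in dam and b in sire) or (b in dam and a in sire)

def excluded_loci(offspring, dam, sire):
    """Return the loci at which the claimed parent pair cannot have produced the offspring."""
    return [locus for locus in offspring
            if not compatible_at_locus(offspring[locus], dam[locus], sire[locus])]

# Hypothetical genotypes: allele sizes in bp at three made-up microsatellite loci
dam       = {"MS1": (120, 124), "MS2": (201, 205), "MS3": (88, 92)}
sire_a    = {"MS1": (118, 124), "MS2": (199, 203), "MS3": (90, 92)}
sire_b    = {"MS1": (122, 126), "MS2": (203, 207), "MS3": (86, 94)}
offspring = {"MS1": (118, 120), "MS2": (203, 205), "MS3": (88, 90)}

print("sire A excluded at:", excluded_loci(offspring, dam, sire_a))
print("sire B excluded at:", excluded_loci(offspring, dam, sire_b))
```

In this made-up case sire A is compatible at every locus, while sire B is excluded at two of the three; in practice exclusion at more than one locus is usually required before a parentage claim is rejected.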


C. Transgenic Animals 

When improving livestock by traditional breeding, one is limited to 
an exploitation of the alleles that are already present in the animal 
population. The results in 1982 of the introduction of extra DNA into 
pronuclei of fertilized mouse eggs suggested the possibility of a quicker 
and larger improvement of production animals. Experiments with in¬ 
jection of GH in lactating cattle had demonstrated a significant boost 
in milk yields. It was natural to ask: Can the same result be obtained 
with extra gene copies for the hormone? Will pigs with extra GH genes 
reach slaughter weight sooner, and with lower feed consumption? 

The first report on transgenic livestock animals appeared in 1985 
(Hammer et al.). In the following years, these experiments were
extended. Clark et al., in a 1992 review, list 12 reports on transgenic
pigs, 9 on sheep, and 5 on cattle. The average frequency of success 
(transgenic animals/eggs transferred) was 0.59% with pigs and 0.74% 
with sheep, compared to 2-5% routinely obtained with mice. This puts 
a severe constraint on the adoption of this technology. For sheep and 
cattle, however, a promising development is that viable embryos can be 
obtained by in vitro maturation and fertilization of oocytes removed 
from slaughterhouse ovaries (Lu et al., 1987).
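
To see why these frequencies are a constraint, it helps to invert them into the number of eggs that must be transferred, on average, per transgenic animal obtained. The short calculation below uses the percentages quoted above; treating the 2-5% for mice as a simple range, and the function name, are choices made for this illustration.

```python
def eggs_per_transgenic(success_rate_percent):
    """Average number of eggs transferred per transgenic animal obtained."""
    return 100.0 / success_rate_percent

# Success rates (transgenic animals per egg transferred, in percent) from the text
rates = {"pigs": 0.59, "sheep": 0.74, "mice (low)": 2.0, "mice (high)": 5.0}
for species, rate in rates.items():
    print(f"{species}: about {eggs_per_transgenic(rate):.0f} eggs per transgenic animal")
```

On these figures, pigs need roughly 170 transferred eggs per transgenic animal and sheep about 135, against 20-50 for mice.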

1. Growth Regulation 

Most of the pig and sheep experiments involved GH genes from 
various species, and mostly with the mouse metallothionein promoter. 
Pursel et al. (1989) found concentrations of GH in expressing founder 
pigs in the range 14-4000 ng/ml for human GH. The variation could 
reflect the influence of chromosomal position for the inserted gene. To 
obtain a reliable assessment of the effect of the GH transgene on 
growth, Pursel et al. measured the average daily weight gain and feed 
efficiency for two generations of transgenic and control pigs. The combined
transgenic progeny showed 11% faster weight gain than the
controls (p = 0.001), and there was a 16-18% increased feed efficiency.
Equally dramatic was the effect on subcutaneous fat accretion. Mean 
backfat thickness was reduced from 21 to 7.5 mm. 

These favorable traits were offset by considerable deleterious side 
effects, many of which could be the result of the lasting, high
concentrations of GH. The most common clinical signs of disease were
lethargy, lameness, uncoordinated gait, exophthalmos, and thickened skin. A
number of pathologic changes were noted. What appears necessary is 
to achieve a tightly controlled GH gene expression, so that this can be 
confined to 1-2 months during the period of rapid growth. A promoter 
construct that is turned on by a substance added in the feed could solve 
the problem. 


2. Disease Resistance 

One route to specific disease resistance is the introduction of one or 
more immunoglobulin genes producing antibodies against a particular 
pathogen. Model experiments with mice show that this could work 
(Rusconi and Köhler, 1985).

An interesting suggestion for reducing mastitis is to fuse the 
lysostaphin gene with regulatory elements from the betalactoglobulin 
(BLG) gene before microinjection in fertilized bovine eggs. The transgenic
cow would then produce lysostaphin in its udder. Lysostaphin
hydrolyzes the cell wall of Staphylococcus aureus. The idea is being
tested in mice (Clark et al., 1992).

In chickens, Salter and Crittenden (1989) have described a trans¬ 
genic line carrying a defective ALV retrovirus genome that expresses 
only the envelope proteins of the virus. This leads to resistance because 
the viral protein produced competes for the virus receptor binding sites
on the cell surface. This is similar to a successful technique for making 
plants virus resistant. 


3. Wool Production 

In sheep, a major limitation in wool production is the availability of 
cysteine. Two Australian groups have cloned the necessary cysteine 
synthetase genes from bacteria and introduced them into sheep. At 
least one transgenic animal was generated (Rogers, 1990). 

4. Milk Modification 

Various ways of modifying milk protein composition can be envis¬ 
aged (Clark et al., 1992). Caseins could be altered in their phosphoryla¬ 
tion sites and thereby in micelle properties. Reduction in the amount of 
BLG would reduce allergies and inhibit the synthesis of lactose, a 
disaccharide not tolerated by 90% of the world population. It would 
also seem useful to try to introduce in cows the gene for human
lactoferrin, an antimicrobial agent and iron transporter, in order to make
cow milk more suitable for human infants. 

5. Pharmaceutical Proteins 

The area of pharmaceutical proteins will probably never be a large 
part of commercial animal husbandry and breeding, but it deserves 
mention. Clark et al. (1992) have used regulatory sequences from 
sheep BLG to target expression of human factor IX and α1-antitrypsin
(AAT) to the mammary gland. One sheep produced AAT at 30 g/liter, 
making up 50% of the milk proteins (Wright et al., 1991). The lack of 
AAT is one of the most common human genetic disorders. It leads to a 
life-threatening emphysema, requiring repeated administrations of
intact enzyme (200 g per patient per year). The source has hitherto been
human blood plasma, with the limitations and dangers this entails. 

Milk is not the only possible vehicle for medically important proteins 
from transgenic animals. Swanson et al. (1992) have succeeded in mak¬ 
ing pigs transgenic for human globin genes, expressed in pig erythro¬ 
cytes. Human hemoglobin could be separated from pig hemoglobin by 
ion exchange chromatography.

6. Use of Embryonic Stem Cells

A major limitation in the production of transgenic animals by micro¬ 
injection of DNA in fertilized ova is that there is no control over where 
in the genome the extra DNA finds its place. Sometimes it will disrupt
a gene, introducing defects in development or function of the animal; or it
may land in a region where it is poorly expressed. Work with mice has
shown that for ES cells in culture, one can achieve a site-specific inte¬ 
gration of the new DNA with the help of homologous recombination. 
Cells that are shown to have incorporated the gene properly are then 
introduced in the blastocyst. The result is chimeric animals, but in the 
next generation animals with the new gene in their germ cells can be 
found. For domestic livestock, the ES technique has met with some 
problems. However, Notarianni et al. (1990) described the derivation of 
apparently pluripotent cell lines from porcine and ovine blastocysts. 

In cattle and sheep (but not in mice), young have been born following 
transfer of nuclei from cells of the inner cell mass of blastocysts to 
enucleated oocytes (Marx, 1988). Therefore, if gene-modified ES cells 
can be established from these species, it may be possible to transfer 
nuclei from them and avoid the chimeric stage. 


VII. Legal and Ethical Aspects of Patenting Biotechnology 

The issue of patenting in our context refers most directly to patent¬ 
ing of transgenic animals (TGAs). When considering the ethics of such 
patents, it is necessary first to discuss the ethics of generating TGAs. If 
the very production of TGAs should be ethically problematic, patent¬ 
ing, being a stimulus for such procedures, would have to be considered 
even more problematic. One is faced with the question whether such 
manipulation is in line with the respect one should have for an ani¬ 
mal's worth and integrity. We have here a value conflict, a conflict 
between the worth of animals and the worth of humans. In my view, 
the good consequences for humans that result, or may result, from 
these gene manipulations are so great that these procedures are
defensible if they proceed according to the prescriptions contained in
animal welfare laws. 

I have, in preceding sections, outlined the importance of the experiments
with transgenic mice for our understanding of fundamental biological
mechanisms in mammals and thereby for our understanding of
disease processes both in animals and in humans, and for the
possibilities of preventing and curing disease. Improvement of domestic
animal breeds must also be an important goal in a world that needs to
produce food for a quickly increasing population. The transgenic
improvement may be directed toward better animal health, a healthier
composition of meat and milk, and a better feed efficiency. With regard 
to the third category of TGAs, it appears possible that a relatively 
small number of sheep, goats, or cattle will supply the world require¬ 
ments for important proteins, secreted in their milk, proteins that 
perhaps otherwise would have to be isolated from human blood, with 
the limitations and contamination dangers that entails. 

Animal genomes are manipulated, then, based on human needs. 
This is not a behavior that started in 1980, with the first TGA. It 
started with domestication several thousand years ago. Our present-day
swine do not look much like the wild swine, most of our dog breeds
not much like the wolf. But now, as before, we are manipulating a 
fenced-off, limited part of nature. Wolves and wild swine continue to 
evolve in freedom. 

Of course, one may not uninhibitedly manipulate the genome of an 
animal. The Norwegian animal protection law states: “Animals shall 
be treated well and consideration taken for their instincts and natural 
needs.” It cannot be claimed that classical breeding has always fol¬ 
lowed this aim. Some animals have been given new burdens. Is there a 
danger that the use of DNA techniques to produce better production 
animals will entail additional burdens? Possibilities for this are pres¬ 
ent when one interferes with a balanced system. There are indeed, as 
mentioned, reported examples of this for transgenic pigs. During a 
research period, this may be acceptable. But before a TGA is put in 
regular production, a thorough analysis of its anatomy, physiology, 
and behavior must demonstrate that it does not suffer from some new 
stress factor. In the Norwegian gene technology law of 1993, the ani¬ 
mal protection law was revised to cover these concerns. Work with 
experimental animals like mice and rats has a special legal status. The 
experiments must be approved and certain procedures followed to min¬ 
imize pain. 

Does patenting of TGAs introduce additional ethical problems? The 
patent system exists to stimulate inventors to come up with new solutions
to practical problems. If the patenting of transgenic sheep that
produce α1-antitrypsin stimulates the generation of such sheep, it
would be unethical to prohibit such a patent. A number of objections 
have nevertheless been raised against TGA patents. One is the view 
that it will reduce the access of developing nations to transgenic 
breeds, when such become available. An answer to this is that without 
patentability, the transgenic production animal may never be devel¬ 
oped. An existing patented TGA breed may, furthermore, find its way 
to a developing nation if licensing fees are paid by a foreign aid
organization and if farmers’ privilege for use of offspring is granted.

Animal rights protagonists have claimed that patenting will in¬ 
crease the suffering of experimental animals (Raines, 1991). But the 
patent system is not created to be a substitute for regulatory authority. 
If existing regulations are inadequate to protect animal welfare, they 
should be changed. Others argue that patenting of life reflects a domi¬ 
neering and materialistic attitude toward living creatures. Is it more 
materialistic than our domestication of animals? In our culture, ani¬ 
mals are bought and sold, owned and used, slaughtered and eaten. 
Whether they are patented or not, they should be handled with care 
and compassion. 

I. Introduction

Our current understanding of the mechanisms underlying on¬ 
cogenesis has evolved from critical advances made in the diverse fields 
of genetics, cytogenetics, virology, cell biology, and molecular biology. 
Although genetics has revealed that cancer can be heritable, the study 
of tumor viruses has shown that cancer genes, or oncogenes, can arise 
from normal cellular genes, or proto-oncogenes; this lesson is under¬ 
scored by cytogenetic studies of chromosomal abnormalities, which 
have guided us to the genomic location of many important proto-on¬ 
cogenes. Molecular biology has made it possible to clone, sequence, and 
manipulate cancer-causing genes while cellular biology has revealed 
their normal and oncogenic functions (Klein and Klein, 1985). Now 
that we have a firm grasp of many of the mechanisms of oncogenesis, 
we are able to begin developing therapies for some cancers at the 
genetic level. After considering how genes involved in oncogenesis are 
detected and isolated, discussing some of their normal and oncogenic 
functions, and presenting techniques used to elucidate these functions, 
we will provide an introduction to these initial approaches to gene 
therapy directed against cancer. 

We now know that oncogenesis is usually a multistep process driven 
by the accumulation of genetic alterations, usually over the span of 
years, which culminates in deregulation of normal cellular controls on 
differentiation, division, and proliferation. Accumulation of additional 
genetic changes within the transformed population of cells leads to the 
production of metastasis-competent tumor cells. Insults to the genome 
leading to cancer can be in the form of chromosomal rearrangement, 
deletion, point mutation, or amplification. In animals and less fre¬ 
quently in humans, viral infection can also be involved in oncogenesis, 
occasionally resulting in acute transformation but more commonly 
playing an indirect role. Many types of genetic lesions are repaired at 
high frequency, but those that are not correctly repaired result in 
daughter cells harboring a genetic alteration. Frequently, mutations
are deleterious and novel cells containing them die. Occasionally, mu¬ 
tations confer a growth advantage by disrupting any of a variety of 
normal cellular processes and thus initiate the process of oncogenesis. 
In deoxyribonucleic acid (DNA) tumor virus infection, viral gene ex¬ 
pression can stimulate growth of the infected cell. Once a population of 
cells possessing a growth advantage is established, additional muta¬ 
tions can occur in individual cells leading to the acquisition of addition¬ 
al characteristics necessary for the development of the cancerous, or 
transformed, phenotype. Such mutations include those that can cause 
the cell to respond to a broader array of growth factors or obviate the 
need for growth factors entirely and those that can release the cell 
from normal controls of cell division and senescence. 

Mutations that directly promote oncogenesis affect genes that are 
either oncogenes or tumor-suppressor genes. Oncogenes are domi¬ 
nantly acting tumorigenic versions of normal cellular genes involved in 
signal transduction identified largely by the study of the acutely trans¬ 
forming retroviruses of animal tumor model systems. In contrast, 
tumor-suppressor genes normally function to suppress cellular growth 
and thus loss or inactivation of both copies is required for transforma¬ 
tion. Tumor-suppressor genes have been identified primarily through 
the study of inherited human cancer syndromes. Most known tumor- 
suppressor genes encode transcription factors, but some encode cyto¬ 
plasmic or cytoskeletal-associated proteins involved in signal trans¬ 
duction. Inactivation of DNA repair enzymes is indirectly involved in 
oncogenesis because loss of both copies of some DNA repair genes leads 
to cancer by increasing the number of mutations that escape detection 
and repair in the proto-oncogenes and tumor-suppressor genes during
DNA synthesis. Another broad class of genes, metastasis-associated 
genes, are also not directly tumorigenic but are involved in cancer 
because they render tumor cells competent to migrate to, invade, and 
colonize a secondary site. However, there is evidence that expression of 
at least some transforming oncogenes may confer all or many of the 
phenotypic changes associated with metastasis by virtue of aberrant 
expression of a normal developmental genetic program. 


II. Detection of Genes Involved in Oncogenesis 

Oncogenes and tumor-suppressor genes, including genes encoding 
some of the DNA repair enzymes, have been identified by four primary 
approaches: (1) DNA-mediated gene transfer in cell culture, (2) identification
of oncogenes in the transforming retroviruses, (3) study of the
transforming genes in DNA tumor viruses, and (4) identification of 
genes at nonrandom sites of chromosomal rearrangement associated 
with malignancies and heritable cancer predisposition syndromes. The 
transforming potential of many of the known proto-oncogenes and 
tumor-suppressor genes has often been revealed by more than one 
approach, underscoring the pivotal role these genes play in
oncogenesis.

A. DNA-Mediated Gene Transfer in Cell Culture 

DNA transfection, or gene transfer, was first used to identify the 
transforming genes of ribonucleic acid (RNA) and DNA tumor viruses 
(see Section II,C) and subsequently to identify the transforming genes 
in a variety of tumor cell lines (Hill and Hillova, 1972). In this assay, 
DNA sequences suspected of harboring oncogenic properties are trans¬ 
fected into immortal but nontransformed cells by any of a variety of 
techniques. Usually, murine fibroblasts such as NIH/3T3 or Swiss 3T3
cells are the recipient cell lines of choice because they can be main¬ 
tained as contact-inhibited, nontumorigenic cell lines. After transfec¬ 
tion, the recipient cells are monitored for morphological transforma¬ 
tion resulting from loss of contact inhibition by testing for focus 
formation and ability to grow in soft agar (Blair et al., 1982).

Once a transformed colony of cells has been established, the foreign, 
transforming DNA can be recovered and molecularly cloned and stud¬ 
ied to determine the basis for its transforming effect. Many oncogenes 
from mouse and human tumor cell lines have been identified using this 
approach (Goldfarb et al., 1982).

Many of the transfectable human transforming genes were shown to 
be members of the ras family of oncogenes. These oncogenes were first 
identified in acutely transforming Harvey and Kirsten sarcoma viruses 
and were designated c-H-ras and c-K-ras (Der et al., 1982). A ras gene
family member identified in a human neuroblastoma cell line and a
human promyelocytic leukemia cell line was designated N-ras
(Murray et al., 1981). Approximately 15% of human tumor cell lines and
fresh tumor biopsies have activated ras oncogenes as detected by this 
assay, and this number may be much higher in some human tumors, 
such as colorectal cancers (Bos et al., 1987). Ras appears to exert its
oncogenic effect by activating kinases involved in signal transduction. 
This process is described more fully later. 

A number of novel transforming genes that are not members of the 
ras gene family or related to the viral oncogenes of the acutely transforming
retroviruses have been identified by DNA transfection assays.
These include the neu, met, trk, mas, HST, and KS3 oncogenes (Perkins
and Vande Woude, 1993; Benz et al., 1993; Reddy et al., 1988; see Table
1). With the exception of the mas oncogene, which appears to be a 
unique integral membrane protein, the other oncogenes are members 
of either the tyrosine kinase growth factor receptor family (neu, met, or
trk) or the fibroblast growth factor (FGF) family (HST, KS3). 

B. Retroviral-Mediated Oncogenesis 

The first oncogenes were identified in studies of cancer-causing ret¬ 
roviruses and their continued study has led to the identification of a 
variety of oncogenes and proto-oncogenes. Retroviruses can activate 
proto-oncogenes in two primary ways: (1) by proviral insertion into the host
genome, which activates a nearby proto-oncogene at the transcriptional level,
and (2) by transduction of activated oncogenes resulting from proviral
recombination with the host genome in such a way that a portion of a
proto-oncogene is packaged in the viral particle as part of the proviral
genome (Varmus, 1988). Retroviral-mediated oncogenesis by viruses carrying
transduced oncogenes is
quite rapid and thus these viruses are called the acutely transforming 
retroviruses. In human disease, retroviruses can cause transformation 
by insertional activation but no naturally occurring human retrovirus 
has been found that transduces a human oncogene. However, the on¬ 
cogenes and proto-oncogenes identified by studying the acutely trans¬ 
forming retroviruses of animals are also involved in nonviral on¬ 
cogenesis in humans. 

1. Activation of Proto-oncogenes by Proviral Insertion 

The stable integration of the provirus into the host chromosome is an 
important step in the retrovirus life cycle. One potential consequence 
of proviral integration is the disruption of host genes, which can result 
in cellular transformation if a cellular gene controlling growth or dif¬ 
ferentiation is located at the site of integration. Because integration is 
essentially random and animal genomes are large, the likelihood of 
disrupting both copies of a single, dominantly acting gene is low. Thus, 
although proviral disruption of wild-type tumor-suppressor genes is 
possible, this has not yet been observed. 

TABLE 1
Oncogenes

Oncogene              Method of Identification                       Associated Tumor

Growth Factors
v-sis                 SSV transduction (monkey)                      Glioma/fibrosarcoma
int-2                 MMTV proviral insertion (mouse)                Mammary carcinoma
il2                   GaLV proviral insertion (ape)                  T-cell lymphoma
il3                   IAP proviral insertion (mouse)                 Myelomonocyte leukemia
ks3                   DNA transfection                               Kaposi’s sarcoma
hst                   DNA transfection                               Stomach carcinoma

Integral Membrane Tyrosine Kinases
v-fms                 SM-FeSV transduction (cat)                     Sarcoma
v-erbB                AEV-H transduction (chicken)                   Erythroleukemia/sarcoma
c-erbB                ALV proviral insertion (chicken)               Erythroleukemia
                      Amplification (human)                          Glioblastoma
c-erbB2               Amplification (human)                          Mammary carcinoma
v-kit                 HZ4-FeSV transduction (cat)                    Sarcoma
v-ros                 UR2 transduction (chicken)                     Sarcoma
neu                   DNA transfection (rat)                         Neuroblastoma/carcinoma
met                   DNA transfection                               Carcinomas/sarcomas
trk                   DNA transfection                               Colon carcinoma

Membrane-Associated Tyrosine Kinases
v-src                 RSV transduction (chicken)                     Sarcoma
v-yes                 Y73/ESV transduction (chicken)                 Sarcoma
v-fgr                 Gr-FeSV transduction (cat)                     Sarcoma
v-fps                 FuSV transduction (chicken)                    Sarcoma
v-fes                 STV transduction (cat)                         Sarcoma
v-abl                 HZ2-FeSV transduction (cat)                    Sarcoma
v-abl                 Ab-MLV transduction (mouse)                    Leukemia

Serine-Threonine Kinases
v-mos                 Mo-MSV transduction (mouse)                    Sarcoma
c-mos                 IAP proviral insertion (mouse)                 Plasmacytoma
v-raf                 MSV-3611 transduction (mouse)                  Sarcoma

Ras-Related
v-H-ras               Ha-MSV (rat)                                   Sarcoma
                      MAV proviral insertion (chicken)               Nephroblastoma
                      Point mutation (human)                         Bladder and other carcinomas
v-K-ras               Ki-MSV (rat)                                   Sarcoma
                      Point mutation (human)                         Carcinomas and leukemias
N-ras                 DNA transfection (human)
                      Point mutation (human)                         Myeloid leukemia

Nuclear Proteins/Transcription Factors
v-myc                 MC29 transduction (chicken)                    Carcinoma; myeloid leukemia
c-myc                 ALV, CSV, REV proviral insertion (chicken)     Bursal lymphoma
                      MLV (mouse) and FeLV (cat) proviral insertion  T-cell lymphoma
                      Amplification (human)                          Small-cell lung carcinoma
N-myc                 Amplification (human)                          Neuroblastoma; small-cell lung carcinoma
L-myc                 Amplification (human)                          Small-cell lung carcinoma
v-myb                 E26 (chicken)                                  Erythroleukemia
c-myb                 MLV proviral insertion (mouse)                 Lymphosarcoma
v-fos                 FBJ-MSV (mouse)                                Osteosarcoma
v-jun                 ASV transduction (chicken)                     Leukemia
v-rel                 REV-T (turkey)                                 Lymphatic leukemia
v-ets                 E26 (chicken)                                  Erythroleukemia
v-erbA                AEV-ES4 (chicken)                              Erythroblastosis
v-ski                 SK-ASV (chicken)                               Carcinoma

Others
mas (transmembrane)   DNA transfection (human)                       Mammary carcinoma
int-1                 MMTV proviral insertion (mouse)                Mammary carcinoma

Because retroviruses carry their own transcriptional regulatory
sequences in their so-called long terminal repeats (LTRs), random
insertion of the provirus necessarily involves the random insertion of
transcriptional promoters, which can alter expression of host genes near
the site of insertion. LTRs are very strong promoters of transcription;
proviral integration can thus up-regulate host gene expression.
Overexpression or aberrant expression of a proto-oncogene can have
oncogenic consequences, frequently giving rise to leukemias or
lymphomas (Hayward et al., 1981). Thus, these viruses are named leukemia or
leukosis viruses. Insertional activation of proto-oncogenes occurs fre¬ 
quently in animal tumors, but it does not appear to play a significant 
role in human tumors (see Section II, B, 3). The avian leukosis virus 
and the mouse mammary tumor virus are well-studied examples of 
retroviruses that transform by insertional activation (Weiss et al.,
1982). The long latent period for disease caused by these retroviruses 
is partially the result of the low probability that proviruses will inte¬ 
grate into or adjacent to a host cellular proto-oncogene. Many on¬ 
cogenes have been identified on the basis of insertional activation, 
including ras, mos, wnt-1, erbB, fms, myc, and others (see Table 1).

2. Oncogene Transduction by Acutely Transforming Retroviruses 

In contrast to retroviruses that transform via insertional activation 
of cellular proto-oncogenes, acutely transforming retroviruses can pro¬ 
duce tumors in newborn animals in less than two weeks. The differ¬ 
ence in disease latency observed with these two virus classes is due to 
the fact that the acutely transforming retroviruses have acquired host 
genes or portions of genes that cause rapid transformation (Bishop, 
1985). These host-derived genes are called viral oncogenes (v-onc), and
many have been identified in acutely transforming retroviruses (see
Table 1). Because the maximum length of the proviral genome that can be
packaged in a retroviral virion is relatively fixed, some of the viral 
genes are lost when host genes are transduced into the retroviral ge¬ 
nome. As a result, the acutely transforming retroviruses usually are 
replication defective such that spread of the virus requires co-infection 
with a closely related, replication-competent, helper virus. The helper 
virus is required to provide the viral gene products necessary for viral 
replication and production of infectious viral particles containing the 
transforming viral genome. A notable exception is the Rous sarcoma
virus (RSV), which carries the v-src oncogene yet remains replication
competent. Fortunately, no acutely
transforming human retroviruses have yet been identified, although 
proviral sequences comprise a significant percentage of the human 
genome. This lack is usually attributed to the paucity of actively repli¬ 
cating endogenous human retroviruses available to support the repli¬ 
cation of replication-deficient, transforming retroviruses. 

Although the acutely transforming retroviruses have little impact on 
human health, the characterization of their v-onc sequences and
subsequent identification of the proto-oncogenes from which they were
derived have led to the discovery of a large number of transforming 
genes that are activated by non-viral mechanisms in human cancers 
(see Table 1). Additionally, study of the acutely transforming retro¬ 
viruses has greatly enhanced our understanding of oncogenic activa¬ 
tion and mechanisms of transformation. Retroviral replication is char¬ 
acterized by a high mutation rate of the viral genome. This high 
mutation rate, coupled with selection for increased tumorigenic poten¬ 
tial, can result in viruses with numerous changes in the v-onc se¬ 
quences. Multiple types of genetic aberrations have been identified, 
including multiple point mutations, deleted exons, and deleted
transcriptional and posttranscriptional regulatory elements (Curran et al.,
1984). Many v-onc genes are expressed as fusion products with viral 
genes and are transcribed at high levels under the constitutive control 
of the retroviral LTR (Bishop, 1985). The fusion products contribute to 
transforming potential by misdirecting an oncogene product to an im¬ 
proper cellular location or by deleting or adding a regulatory domain. 
Moreover, the target cell specificity of the retrovirus can result in the 
expression of the oncogene in an inappropriate cell type. 

3. Human Retroviruses in Oncogenesis 

As noted previously, human cancers do not routinely arise from ei¬ 
ther proviral insertion or oncogene transduction. However, human ret¬ 
roviruses do exist and can contribute to oncogenesis. The transforming 
potential of both the human immunodeficiency viruses, type 1 and 2 
(HIV-1 and -2) and the human T-cell leukemia viruses, I and II (HTLV-I 
and -II) is low; transformation is infrequent and occurs only after very 
long periods of latent infection. The mechanism of transformation for 
the HIVs and the HTLVs is largely indirect and probably results from 
the action of virally encoded regulatory proteins designed to control 
virus replication by regulating the expression of cellular genes and the 
proliferation of host cells (Kieff, 1996). 

Although HTLV-I and -II are thought to have similar mechanisms of 
infection, only HTLV-I is frequently associated with human malignan¬ 
cy. A very low percentage of HTLV-I infected individuals develop adult 
T-cell leukemia (ATL; Stewart et al., 1994). ATL usually presents as a
single, clonal tumor after several decades of infection, although a vast 
number of T cells are infected during the long disease progression. 
These observations argue strongly for an indirect transforming mecha¬ 
nism. The HTLV-encoded Tax transcription factor alters expression of 
known proto-oncogenes such as c-myc, c-fos, JunB, JunD, and EGR1, as
well as the growth factor/receptor pair interleukin-2 and granulocyte- 





CORTNER, VANDE WOUDE, AND VANDE WOUDE 


macrophage colony-stimulating factor. The HTLV-encoded protein Rex 
also contributes to alterations in normal cellular proliferation by
interfering with cellular mRNA processing. HTLV glycoproteins are
expressed on the surface of infected cells and may chronically stimulate
their receptor, the T-cell receptor. These alterations of normal cellular events
and possibly other less well-understood virally-mediated processes are 
believed to predispose HTLV-I infected T cells to transformation. 

The HIVs are most notable as the causative agents of the human
acquired immunodeficiency syndrome (AIDS). However, the HIVs are 
also associated with some human malignancies, the most common of 
these being Kaposi’s sarcoma and Hodgkin’s and non-Hodgkin’s
lymphomas (Weller, 1994). Transformation by the HIVs most likely
results from the action of auxiliary viral proteins,
as with HTLV-I, and from the immunosuppressive effects of HIV infec¬ 
tion. Interestingly, some lymphomas from AIDS patients may also 
overexpress the fps proto-oncogene as a result of proviral insertion. If 
so, this represents the first instance of insertional activation by a
human retrovirus (Shiramizu et al., 1994).

C. DNA Tumor Viruses 

A variety of non-retroviruses are associated with human tumors. Of 
these, only the Epstein-Barr virus (EBV) is acutely transforming and 
all require long periods of infection before the appearance of the associ¬ 
ated tumors (Rickinson, 1994). In almost all cases, the mechanism of 
oncogenesis is believed to be indirect and result from secondary cellu¬ 
lar changes occurring in hyperproliferating, infected cells or in cells 
proliferating as part of tissue regeneration following immune clearing 
of infected cells. For EBV, overexpression of the viral protein LMP1 is 
transforming as a result of an unknown mechanism affecting prolifera¬ 
tion of latently infected cells. 

Papovaviruses, composed of the polyomaviruses and the papillo¬ 
maviruses, also cause hyperproliferation of infected cells; the mecha¬ 
nisms papovaviruses use for overriding cellular proliferation controls 
involve inactivation of tumor-suppressor gene products such as p105RB
and p53 or activation of cellular proto-oncogene products by
complexing with viral proteins. Polyoma, SV40, and adenovirus are all
transforming DNA tumor viruses. The polyoma viral oncoprotein middle-T
antigen binds to the src proto-oncogene product and activates its kinase
activity by blocking a downregulating phosphorylation event (Court¬ 
neidge, 1985). The polyoma viral oncoprotein middle-T antigen binds 
to and inactivates p53. SV40-mediated transformation requires SV40- 



GENES INVOLVED IN ONCOGENESIS 




encoded large-T antigen capable of complexing with the products of 
cellular p53 and RB tumor-suppressor genes. Adenovirus E1A
oncoprotein also binds to and inactivates cellular p105RB (De Caprio et
al., 1988; Whyte et al., 1988).

The human papillomaviruses (HPVs) cause hyperproliferation of in¬ 
fected epithelial cells of the mid-epidermis producing benign lesions 
that can occasionally develop into tumors, the most common of these 
being cervical carcinomas (Howley, 1993). The viral oncoproteins re¬ 
sponsible for causing hyperproliferation and predisposition to malig¬ 
nant transformation are encoded by the E6 and E7 genes. E6 forms 
complexes with p53 and causes its degradation. E7 binds p105RB and
related proteins, causing the E2F transcription factor normally
sequestered by p105RB during certain times in the cell cycle to be
released and thus competent to support gene transcription throughout
the cell cycle. There are over 100 HPVs with a range of transforming 
potentials; the most highly transforming HPVs (HPV-16 and HPV-18) 
bind these cellular proteins more efficiently than the equivalent pro¬ 
teins encoded by the less-transforming HPVs, supporting the idea that 
these interactions are transforming. 

D. Oncogenesis Associated with 
Chromosomal Rearrangements 

Nonrandom chromosomal abnormalities have proven invaluable for 
the identification of genes involved in oncogenesis, and for providing 
diagnostic clues for certain tumors. Interestingly, a correlation exists 
between the type of chromosomal abnormality and the histopathologic 
type of tumor in which it is found, suggesting that certain cell types are 
particularly susceptible to transformation by the specific oncogene(s) 
created by the corresponding chromosomal rearrangement. Most of the 
abnormalities characterized to date occur in hematologic malignan¬ 
cies. This phenomenon is widely attributed to the relative ease with 
which these tumors can be grown in culture, but may also be caused by 
a higher frequency of productive chromosomal abnormalities resulting 
from errors occurring during normal, somatic rearrangements in these 
cell types. It has also been postulated that hematopoietic cells are 
particularly sensitive to transformation because of the delicate balance 
of regulatory factors involved in the process of hematopoietic differen¬ 
tiation (Look, 1995). Nonrandom abnormalities in chromosomes from 
solid tumors have also been described and have revealed the identity of 
many oncogenes. 

TABLE 2
Oncogenes Activated by Chromosomal Rearrangements

Oncogene              Function                        Rearrangement       Associated Tumor

bcl-2 activation      Prevents apoptosis              t(14;18)(q32;q21)   Follicular center cell lymphoma
bcr-abl               p210 abl tyr kinase             t(9;22)(q34;q11)    Chronic myelogenous leukemia
bcr-abl               p180 abl tyr kinase             t(9;22)(q34;q11)    (T-ALL) [a]
Ttg-1/Rhombotin       Zn transcription factor         t(11;14)(p15;q11)   T-ALL
lyl-1                 HLH transcription factor        t(7;19)(q34;p13)    T-ALL
SCL(tal;TCL-5)        HLH transcription factor        t(1;14)(p32;q11)    T-ALL
SCL/SIL               Interrupter locus               del(1) p33          T-ALL
Hox-11                Homeobox transcription factor   t(10;14)            T-ALL
TAN-1                 Notch homolog                   t(7;9)              T-ALL
E2A-prl               Homeobox transcription factor   t(1;19)(q23;p13.3)  Pre B ALL
PRAD-1, bcl-1         Cyclin D/cell cycle regulation  t(11;14)(q13;q32)   B-cell lymphoma: breast and other carcinomas
cyc D1                Cyclin D/cell cycle regulation  Amplification       B-cell lymphoma: breast and other carcinomas
c-myc Ig locus        Transcription factor            t(8;14)(q24;q32)    Burkitt’s lymphoma
APL-RARα              Deregulation gene expression    t(15;17)            Acute promyelocytic leukemia
PRAD-1                Cyclin D/cell cycle regulation  inv(11)             Parathyroid adenoma
G1 cyclin             Cell cycle regulation           inv(11)             Breast, squamous cell carcinomas
c-myc                 Transcription factor            Amplification       Small-cell lung carcinoma
L-myc                 Transcription factor            Amplification       Small-cell lung carcinoma
N-myc                 Transcription factor            Amplification       Small-cell lung carcinoma
N-myc                 Transcription factor            Amplification       Neuroblastoma
c-erbB                tyr kinase GFR                  Amplification       Glioblastomas
c-erbB2 (her2/neu)    tyr kinase GFR                  Amplification       Mammary carcinoma
hst                   Growth factor                   Amplification       Gastric carcinoma
mdm-2                 p53 binding protein             Amplification       Sarcomas
gli                   Transcription factor            Amplification       Sarcomas, gliomas
cdk4                  Cyclin-dependent kinase         Amplification       Sarcomas, gliomas
ret                   tyr kinase GFR                  Translocation       Thyroid carcinoma
trk                   tyr kinase GFR                  Translocation       Colorectal carcinomas

[a] T-ALL: T-cell acute lymphoblastic leukemia

Table 2 lists the chromosomal rearrangements for which the resident
genes have been identified and cloned. These rearrangements
include translocations (exchange of material between two chromo¬ 
somes), inversions (rearrangement of material within a chromosome), 
deletions (net loss of a portion of a chromosome), and duplication or 
amplification (both causing net gain of chromosomal segments; Trent, 
1995). Aberrations in normal chromosome recombination and replica¬ 
tion that produce such chromosomal rearrangements occur at a much 
higher frequency (between 1 and 10 per 1000 cell divisions) than the 
rate of accumulation of uncorrected point mutations in most genes 
within the normal human genome (about 1 to 10 per million cell divi¬ 
sions; Weigel, 1995). Thus, it is not surprising that chromosomal rear¬ 
rangements underlie many of the genetic alterations involved in on¬ 
cogenesis. 
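
The comparison of the two rates can be worked through as simple arithmetic. The following Python sketch is only an illustration: it uses the two ranges quoted above, the function names are hypothetical, and it assumes the two figures can be compared directly even though one is quoted per cell division and the other per gene per cell division.

```python
from fractions import Fraction

# Per-division rates as quoted in the text, stored as (low, high) ranges.
REARRANGEMENT = (Fraction(1, 1000), Fraction(10, 1000))
POINT_MUTATION = (Fraction(1, 1_000_000), Fraction(10, 1_000_000))


def fold_range(numerator, denominator):
    """Smallest and largest possible ratio between two (low, high) ranges."""
    return numerator[0] / denominator[1], numerator[1] / denominator[0]


def expected_events(rate, divisions):
    """Expected event count after `divisions` cell divisions (rate * n)."""
    return rate * divisions


low, high = fold_range(REARRANGEMENT, POINT_MUTATION)
print(f"Rearrangements are {low} to {high} times more frequent")

# Hypothetical illustration: one million divisions at the low end of each range.
print(expected_events(REARRANGEMENT[0], 1_000_000))   # rearrangements
print(expected_events(POINT_MUTATION[0], 1_000_000))  # point mutations in one gene
```

Comparing like with like (low end against low end, or high against high) gives a 1000-fold difference; across the full quoted ranges the ratio lies between 100 and 10,000.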


1. Translocation 

Activation of proto-oncogenes by chromosomal rearrangements can 
occur by creation of novel fusions between segments of genes resulting 
from translocations. As in the case of proto-oncogene activation by 
acutely transforming retroviruses, the translocations can be activating 
by creating a novel fusion protein or by up-regulating expression of a 
proto-oncogene by positioning it downstream from a transcriptional 
promoter. Many translocations involve either the T-cell receptor (TCR) 
or immunoglobulin (Ig) loci, in T- and B-cell malignancies, respectively. 
Sequence analysis of the joint between these loci and the foreign loci
reveals the presence of canonical heptamer-nonamer sequences, which
are used during normal somatic rearrangement of the TCR and Ig
genes, and indicates that the translocation most likely arose by a
recombination error. Juxtaposed with the TCR or Ig loci, the
proto-oncogene becomes transcriptionally deregulated, structurally
altered, or both, rendering the novel gene product oncogenic.

One common feature in several of the genes that are deregulated by 
translocation in hematopoietic malignancies is that they encode pro¬ 
teins with significant homology to known transcriptional regulatory 
proteins (see Table 2; Cleary, 1991). For example, myc, lyl-1, and
tal/scl all encode transcription factors containing helix-loop-helix
dimerization motifs. The t(15;17) translocation characteristic of acute
promyelocytic leukemia (APL) results in the deregulated production of a
chimeric protein containing much of the retinoic acid receptor α, which
normally functions as a transcription factor in the presence of its
ligand (de Thé et al., 1990). The deregulated expression of this chimeric
factor disrupts normal cellular events by altering the expression of 
genes regulated normally by the proto-oncogene product. 

About 95% of chronic myelocytic leukemias (CML) result from a
translocation between chromosomes 9 and 22, producing the so-called
Philadelphia chromosome. In this translocation, the c-abl
proto-oncogene is translocated from chromosome 9q34 to a breakpoint cluster
region (bcr) in chromosome 22q11. The bcr region provides a
transcriptional promoter and the 5' end of a fusion gene that encodes the kinase
domain of the abl proto-oncogene at its 3' end. The novel fusion
protein, bcr-abl protein kinase, causes a CML-like disease when the
bcr/abl gene is expressed in transgenic mice (Collins et al., 1984;
Heisterkamp et al., 1990). It appears that bcr-encoded sequences in the
chimeric protein activate the normal c-abl tyrosine kinase, which is 
known to be involved in signal transduction, by direct physical binding 
of the kinase regulatory domain of abl (Pendergast et al., 1991). 

2. Amplification 

In a process known as amplification, activation of proto-oncogenes 
can occur by duplication of a chromosomal segment spanning a proto¬ 
oncogene. Chromosomal amplification of a proto-oncogene increases the 
dosage of the proto-oncogene proportional to the number of copies of the 
gene and results in overexpression of the gene product. Several cellular 
oncogenes are amplified in human tumors (see Table 2). The c-myc 
proto-oncogene locus is amplified in a promyelocytic leukemia both in 
the primary tumor and in the cell line HL-60, which was derived from 
the tumor (Dalla Favera et al., 1982). Other oncogenes, such as c-erbB 
(EGFR), neu (HER-2), and c-myc family members, have been shown to 
be amplified in specific tumor types (see Table 2), and the presence of 
multiple copies of these genes has been associated with poor prognosis. 
The presence of multiple copies of N-myc (which was first identified as 
an amplified gene in human neuroblastoma) correlates with advanced 
stages of the disease. Likewise, the amplification of myc family members 
in small-cell lung carcinoma is also associated with a more aggressive 
disease progression (Little et al., 1983). Thus, the myc family members 
appear to be associated with the progression of neuroblastomas and 
small-cell lung carcinomas, whereas the c-erbB, or EGFR, gene is ampli¬ 
fied in glioblastomas and squamous carcinomas. 

3. Chromosomal Deletion and Loss of Heterozygosity 

TABLE 3

Tumor Suppressor Genes

Tumor Suppressor Gene and Gene Family Syndrome                                Heritable and Sporadic Tumors
--------------------------------------------------------------------------------------------------------------------
Transcription Factors
  RB                                  Familial retinoblastoma                 Retinoblastoma; osteosarcoma; soft tissue
                                                                              sarcomas; breast and small-cell lung carcinoma
  WT1                                 Familial Wilms’ tumor                   Wilms’ tumor; hepatoblastoma
  p53                                 Li-Fraumeni syndrome                    Many sarcomas and carcinomas
  BRCA1                               Familial breast cancer                  Breast and ovarian carcinomas
  BRCA2                               Familial breast cancer                  Breast carcinoma

DNA Repair Enzymes
  ERCC (nucleotide excision repair)   Xeroderma pigmentosum (XP)              Skin cancer
  FACC (cross link repair)            Fanconi’s anemia (FA)                   Leukemia
  DNA ligase I                        46BR                                    Lymphoma
  (DNA ligase activity)               Bloom syndrome (BS)                     Carcinomas
  (X-ray response)                    Ataxia telangiectasia (AT)              Lymphoma
  hMLH1 (DNA mismatch repair)         Lynch syndrome (HNPCC type 1)           Nonpolyposis colon cancer
  hMSH2 (DNA mismatch repair)         Lynch syndrome (HNPCC type 2)           Nonpolyposis colon cancer

Cytoplasmic/Cytoskeletal Proteins
  APC                                 Familial adenomatous polyposis (FAP)    Colorectal cancer
  NF1                                 Neurofibromatosis type 1 (NF1)          Neurofibroma; neurofibrosarcoma
  Merlin                              Neurofibromatosis type 2 (NF2)          Acoustic neuroma; meningioma; glioma;
                                                                              schwannoma

Others
  ret (tyr kinase GFR)                Multiple endocrine neoplasia type 2A    Thyroid carcinoma; pheochromocytoma
                                      (MEN2)

DeMars postulated in 1970 that cancer-prone individuals are
heterozygous for the cancer-predisposing gene and that cancer develops in
the individual because of a somatic mutation or loss of the remaining
normal allele (DeMars, 1970). Knudson developed a mathematical
model based on his observations on the incidence and age of onset of
unilateral and bilateral cases of familial retinoblastoma (Knudson,
1971). This model predicted that one additional mutation was the
rate-limiting step in the development of tumors in familial cases, whereas
two events were needed in nonfamilial cases. This hypothesis fit with 
both the high incidence of bilateral tumors in familial cases and the 
earlier age of onset of cancer in the familial cases. The list of the
heritable types of cancer to which Knudson’s model of mutation applies
continues to grow (Weigel, 1995; Benz, 1993; see Table 3). In heritable




CORTNER, VANDE WOUDE, AND VANDE WOUDE 


cancers, the transmitted, germ line mutation can be either a point 
mutation or a large or small deletion. Usually, somatic loss of the 
remaining wild-type allele, which results in cellular transformation, 
arises by chromosomal deletion or loss rather than by point mutation 
because deletions in and loss of chromosomes occur far more frequently 
than do uncorrected point mutations at transforming positions within 
genes. In cytogenetic studies, loss of heterozygosity (LOH) refers to the
loss of a wild-type allele by loss or deletion of a region of a chromosome
at which the gene of interest resides. If the second allele contains an
inactivating mutation, LOH results in a cell lacking functional copies
of both alleles. LOH can also occur in sporadic tumors, indicating that
loss of tumor-suppressor genes can underlie oncogenesis in nonheritable
as well as heritable cancers.
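
The arithmetic behind the one-hit versus two-hit distinction can be illustrated with a toy calculation. This is a minimal sketch and not Knudson's published model: the number of target cells (N_CELLS) and the per-allele inactivation probability (P_HIT) are hypothetical values chosen only for illustration, and the number of cells completing all required hits is assumed to follow a Poisson distribution.

```python
import math


def expected_clones(n_cells: float, p_hit: float, hits_needed: int) -> float:
    """Expected number of cells in which every required hit has occurred."""
    return n_cells * p_hit ** hits_needed


def prob_at_least(k: int, mean: float) -> float:
    """P(X >= k) for a Poisson-distributed count with the given mean."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


# Hypothetical parameters, chosen only to make the contrast visible.
N_CELLS = 1_000_000   # assumed number of susceptible target cells
P_HIT = 3e-6          # assumed probability of inactivating one allele in a cell

# Familial case: one allele is already mutant in every cell, so one hit suffices.
familial_mean = expected_clones(N_CELLS, P_HIT, hits_needed=1)
# Nonfamilial case: both alleles must be hit independently in the same cell.
sporadic_mean = expected_clones(N_CELLS, P_HIT, hits_needed=2)

for label, mean in (("familial", familial_mean), ("nonfamilial", sporadic_mean)):
    print(
        f"{label}: expected tumors = {mean:.2e}, "
        f"P(at least 1) = {prob_at_least(1, mean):.3g}, "
        f"P(at least 2) = {prob_at_least(2, mean):.3g}"
    )
```

Under these assumed numbers, carriers of a germ line mutation are likely to develop one or more tumors, whereas in noncarriers even a single tumor is a rare event; this is the qualitative pattern the model was built to explain.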


III. Mapping and Cloning Genes Involved in Oncogenesis 

When a cancer predisposition syndrome is determined to be heredi¬ 
tary, the underlying gene can be mapped in the genome and cloned 
based on the mapping data. Proto-oncogenes can be mapped and 
cloned by virtue of their homology to viral oncogenes. Gene mapping 
techniques can be either physical or genetic. Physical mapping tech¬ 
niques such as in situ hybridization or somatic cell hybridization are 
used to roughly position genes for which at least a partial DNA se¬ 
quence is available, on a particular chromosome or chromosomal re¬ 
gion. Genetic mapping is used to determine the proximity of two DNA 
markers and is based on a statistical approach called linkage analysis. 
The markers used in genetic analysis can be DNA based or can be a 
phenotype, such as inherited cancer predisposition syndromes. Ap¬ 
proaches to gene cloning vary depending on whether a DNA probe for 
the gene exists and on the nature of the mapping data available. 

A. Physical Mapping: Somatic Cell Hybridization, 
Fluorescence in situ Hybridization, and 
Comparative Genome Hybridization 

Physical mapping techniques can be used to roughly position a gene 
within the genome when a probe, such as a viral oncogene, is available. 
Somatic cell hybridization assigns a gene to a given chromosome by 
following its segregation in a panel of interspecies hybrid cell lines. For 
example, human-hamster cell hybridization produces several distinct 
cell lines with predominantly hamster chromosomes and one or two 
human chromosomes. Human chromosomal assignments are made by
determining which cell lines contain the given human gene or its
marker and correlating this information with the human chromosome 
content of the cell line. 
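
The correlation step described above can be sketched as a concordance score. The panel, the hybrid line names, and the function name below are hypothetical; the sketch simply counts, for each human chromosome, the fraction of hybrid lines in which the presence or absence of the gene signal matches the presence or absence of that chromosome.

```python
from typing import Dict, Set


def concordance_by_chromosome(
    gene_present: Dict[str, bool],
    chromosomes_in_line: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Fraction of hybrid lines in which gene presence agrees with chromosome presence."""
    all_chroms = set().union(*chromosomes_in_line.values())
    scores = {}
    for chrom in sorted(all_chroms):
        agree = sum(
            gene_present[line] == (chrom in chroms)
            for line, chroms in chromosomes_in_line.items()
        )
        scores[chrom] = agree / len(chromosomes_in_line)
    return scores


# Hypothetical panel: human chromosomes retained by each human-hamster hybrid line.
panel = {
    "hybrid_A": {"9", "17"},
    "hybrid_B": {"9"},
    "hybrid_C": {"17", "22"},
    "hybrid_D": {"22"},
}
# Hypothetical hybridization results for the probe in each line.
signal = {"hybrid_A": True, "hybrid_B": True, "hybrid_C": False, "hybrid_D": False}

scores = concordance_by_chromosome(signal, panel)
best = max(scores, key=scores.get)
print(scores, "-> assign gene to chromosome", best)
```

A chromosome with perfect concordance across the panel is the candidate assignment; a real panel would need enough lines that every chromosome has a distinct retention pattern.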

In situ hybridization and fluorescence in situ hybridization (FISH)
are more precise localization techniques used to map a DNA fragment
to a particular region or band within a chromosome (Harper and
Saunders, 1981). Standard in situ hybridization uses isotopically labeled
probes, whereas FISH uses fluorescently labeled probes. FISH is
currently the more widely used of the two techniques because it is more
sensitive. Fluorescently labeled probes are applied directly to
interphase chromosomes and allowed to hybridize. Regions of positive
hybridization are visualized with fluorescence microscopy. The term
chromosome painting describes the detection of large regions of homology
by FISH (Breneman et al., 1995). Mapping by comparative genome
hybridization (CGH) using whole chromosome probes for chromosome 
painting has revealed that very few major changes have occurred in 
the genome organization of mammals during evolution. In addition to 
determining the degree of synteny between the chromosomes of dis¬ 
tantly related species, CGH can be used to identify the regions of 
difference, or chromosomal imbalances, between individuals of the 
same species. This application is quite useful as a diagnostic tool in the 
study of human cancers and other diseases resulting from chromosom¬ 
al rearrangements (Piper et al., 1995; Thompson and Gray, 1993). 

B. Genetic Mapping and Linkage Analysis

Genetic mapping is used to position a gene conferring a particular 
trait to a specific region of the genome by finding correlations between 
the occurrence of the trait and an identifiable DNA fragment that has 
been mapped. Cloning of the gene responsible for the trait under study 
does not require that the mapped DNA fragment be the gene but merely 
reside near it. Genetic markers can be any discernible genetic pheno¬ 
type, such as inherited cancer predisposition syndromes, or identifi¬ 
able DNA fragment, such as chromosomal rearrangements, restriction 
fragment length polymorphisms (RFLPs), microsatellite sequences, 
known genes, or anonymous, unique sequences. The degree of linkage 
between markers is determined by the recombination frequency or the 
genetic distance between them. Recombination frequency is the per¬ 
centage of meioses in which a recombination is detected between two 
markers. Genetic distance is measured in centimorgans (cM); if
recombination between two markers is observed in one of 100 meioses, then
they are separated by 1 cM, which in the human genome corresponds
to roughly 10^6 base pairs (bp) of DNA. In the case of chromosomal
rearrangements with a strong genetic linkage to cancer predisposition
syndromes, the gene affected by the rearrangement is frequently involved
in oncogenesis and can be cloned on the basis of the correlation with
the rearrangement. RFLP analysis has been used to map the genes for
cystic fibrosis, neurofibromatosis type 1 (NF1), multiple polyposis coli,
Li-Fraumeni syndrome, muscular dystrophy, retinoblastoma, colon
cancers, and small-cell lung cancers (Rommens, 1995; Wallace et al.,
1990; Dunlop et al., 1990; Kunkel et al., 1989; Lee et al., 1987; Yokota
et al., 1987).
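
The relationship between recombination frequency, genetic distance, and linkage can be made concrete with a short calculation. This is a minimal sketch: the function names are illustrative, the 10^6 bp per cM conversion is only the rough human average quoted above, and the LOD score shown assumes fully informative, phase-known meioses, which real pedigrees rarely provide.

```python
import math


def recombination_fraction(recombinants: int, meioses: int) -> float:
    """Fraction of meioses in which the two markers recombined."""
    return recombinants / meioses


def map_distance_cm(theta: float) -> float:
    """Genetic distance in cM; a direct conversion that is reasonable only for small theta."""
    return 100.0 * theta


def approx_physical_bp(distance_cm: float, bp_per_cm: float = 1e6) -> float:
    """Rough physical distance, assuming the average of about 10^6 bp per cM."""
    return distance_cm * bp_per_cm


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 odds of linkage at theta versus independent assortment (theta = 0.5)."""
    if not 0.0 < theta < 0.5:
        raise ValueError("theta must lie strictly between 0 and 0.5")
    r, n = recombinants, meioses
    log_linked = r * math.log10(theta) + (n - r) * math.log10(1.0 - theta)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked


# The example from the text: one recombinant observed in 100 meioses.
theta = recombination_fraction(1, 100)
distance = map_distance_cm(theta)
lod = lod_score(1, 100, theta)
print(f"theta = {theta}, distance = {distance} cM, "
      f"about {approx_physical_bp(distance):.0e} bp, LOD = {lod:.1f}")
```

By convention, a LOD score of 3 or more is usually taken as significant evidence for linkage, so the example data would strongly support linkage between the two markers.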

C. Cloning Disease Genes: Screening for Oncogene Homologs, 

Candidate Gene Cloning, Chromosome Walking, Positional 
Cloning, and Representational Difference Analysis 

If a DNA fragment is available, such as an oncogene carried by a 
retrovirus, the homologous proto-oncogene can be cloned from a cDNA 
or genomic library by hybridization screening techniques using the 
oncogene as a probe (Sambrook et al., 1989). Once the gene has been 
cloned, its regulatory regions and the structure and function of the 
encoded protein can be studied. 

No prior knowledge of the position of the gene within the genome is 
necessary to clone a proto-oncogene corresponding to a known on¬ 
cogene. However, once the gene is cloned, it can be mapped and used as 
a marker in genetic linkage studies to help identify the position of 
other genes. A mapped proto-oncogene may become a candidate gene 
in the search for the cause of a particular disease if genetic mapping of 
that disease assigns the responsible gene to that same region of a 
chromosome. The strength of the candidacy is increased if the respon¬ 
sible gene is predicted to have a structure or function related to the 
newly mapped gene. Identifying genes responsible for a disease by 
examining all the genes known to map to the same position in the 
genome as the unknown gene conferring the disease is called candidate 
gene cloning. The sequence of the candidate gene from individuals with 
and without the disease is compared to determine whether mutations 
within this gene are responsible for the disease phenotype. About 75%
of the approximately 40 cloned mouse mutations were cloned in this
way (Copeland et al., 1993).

Once the position of a disease gene has been genetically mapped, the 
gene can be cloned using chromosome walking, positional cloning, or 
the candidate gene method (just described). Cloning genes by chromo¬ 
some walking requires that a DNA fragment that is tightly linked to 
the gene of interest is available as a probe. In this technique, probes
derived from the end of one clone are used to rescreen the library to
obtain recombinants that contain adjacent chromosomal DNA
sequences (Watson et al., 1983). Large segments of the mouse and human
major histocompatibility (MHC) locus have been isolated in this man¬ 
ner (Steinmetz et al., 1981). Clones with larger inserts, such as yeast 
artificial chromosomes (YACs), can also be used. 
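
The iterative logic of a chromosome walk can be sketched in a few lines. This is a toy model under stated assumptions: clones are represented as intervals on a hypothetical coordinate system in kilobases, "screening" is reduced to asking whether a clone's insert contains the probe position, and the walk proceeds in one direction only. In practice the coordinates are unknown and each newly isolated clone must be tested experimentally for the gene of interest.

```python
from typing import List, NamedTuple


class Clone(NamedTuple):
    name: str
    start: int  # hypothetical insert start, in kb
    end: int    # hypothetical insert end, in kb


def screen(library: List[Clone], probe_pos: int) -> List[Clone]:
    """Mimic hybridization screening: return clones whose insert contains the probe."""
    return [c for c in library if c.start <= probe_pos <= c.end]


def walk(library: List[Clone], start_probe: int, target: int, max_steps: int = 50) -> List[Clone]:
    """Walk rightward from start_probe until a clone covering the target is recovered."""
    path: List[Clone] = []
    probe = start_probe
    for _ in range(max_steps):
        hits = screen(library, probe)
        if not hits:
            raise ValueError("walk stalled: no clone hybridizes to the probe")
        best = max(hits, key=lambda c: c.end)  # clone extending farthest along the walk
        if path and best.end <= path[-1].end:
            raise ValueError("walk stalled: no clone extends past the current end")
        path.append(best)
        if best.start <= target <= best.end:
            return path
        probe = best.end  # derive the next probe from the end of this clone
    raise ValueError("target not reached within max_steps")


# Hypothetical library of overlapping clones.
library = [
    Clone("A", 0, 40),
    Clone("B", 30, 75),
    Clone("C", 35, 60),
    Clone("D", 70, 110),
    Clone("E", 100, 150),
]

path = walk(library, start_probe=10, target=120)
print(" -> ".join(c.name for c in path))
```

Each step advances only as far as the insert size allows, which is why clones with larger inserts, such as YACs, shorten a walk considerably.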

For positional cloning, a fine-structure linkage map must be con¬ 
structed using markers close to and flanking the gene of interest (Col¬ 
lins, 1992). Animals showing recombination between these markers 
are analyzed to refine the map position down to the approximate insert 
size in YAC libraries. Restriction mapping is used to confirm the physi¬ 
cal distance between the markers. YAC libraries are then screened 
using the known markers as probes, and positive, overlapping YACs 
are ordered into contiguous units. The YACs are screened for the gene 
of interest by expressing the genes contained in the YAC in animals 
bearing the disease. If a particular YAC can reverse or “rescue” the
disease phenotype, then the precise positions of the exons within the
YAC are identified by a technique called exon trapping (Buckler et al.,
1991). The trapped exons can then be used as probes to recover the
corresponding copy DNA (cDNA), which is then used to identify the
entire gene. This technique has been used to clone several mutations in
the mouse, including Brachyury, Snell’s waltzer, agouti, Bcg, and short
ear (Herrmann et al., 1990; Avraham et al., 1995; Bultman et al., 1992;
Vidal et al., 1993; Kingsley et al., 1992).

Representational difference analysis (RDA) allows the regions of dif¬ 
ference between genomes identified by CGH to be recovered and cloned 
(Lisitsyn et al., 1993). Subtractive hybridization is used to enrich for 
DNA fragments uniquely represented in one source but not another, 
closely related source (Sambrook et al., 1989). This technique has been 
used to clone proviruses from the human genome and regions of chro¬ 
mosomal rearrangement and deletion that can occur during normal 
cellular differentiation or oncogenesis. CGH has been used to reveal
that a herpesvirus is linked to Kaposi’s sarcoma in both HIV-infected
and non-HIV-infected patients (Schalling et al., 1995).


IV. Animal Model Systems Used to Study Gene Function

As discussed previously, the study of the transforming retroviruses 
associated with a wide variety of animal tumor systems has led to the 
discovery of many oncogenes and proto-oncogenes (see Table 1).
Additionally, mutations arising in the fruit fly Drosophila melanogaster
and the worm Caenorhabditis elegans following mutagenesis that
confer novel phenotypes have provided a rich source of genes involved in
development. Frequently, the genes underlying these phenotypes also
play important roles in mammalian development and oncogenesis. For
example, the human homologs of many transcription factors involved
in Drosophila development, segmentation, and differentiation have
been found to play a role in leukemogenesis, presumably because
hematopoietic cells require a delicate balance of transcription factor
expression during hematopoietic differentiation (Look, 1995). Gene
function in the control of the cell cycle and in embryogenesis can also be
studied by microinjection of gene transcripts into the oocytes of the 
African clawed frog, Xenopus laevis (Matten and Vande Woude, 1995). 
Transgenic mice are frequently used to study the effects of overex¬ 
pressing a wild-type or mutant gene or the effects of eliminating a gene 
altogether in a mammalian system. Currently, transgenic mice and 
knock-out (i.e., null mutation) mice provide powerful new tools for 
elucidating both the normal cellular function and the oncogenic mecha¬ 
nisms of transforming genes as well as providing methods for testing 
gene therapy approaches to cancer. 

Transgenic refers to organisms that carry genetically engineered 
changes in their germ lines. Currently, the mouse is the most widely 
used animal in transgenic studies (Hughes, 1991). The mouse genome 
can be modified either by introducing genes at random positions or by 
homologous recombination at a specific locus either by directly inject¬ 
ing DNA into the pronucleus of a mouse embryo or by transfecting 
DNA for homologous recombination into embryonic stem cells. The 
mouse model systems provide an excellent tool for studying the func¬ 
tion of a wide range of genes, including oncogenes. Transgenic technol¬ 
ogy can be used to eliminate or alter the expression pattern of a gene or 
to express an altered version of the gene. A large number of strains of 
transgenic mice have been generated in the study of oncogene function 
using these techniques, many of which will be discussed in subsequent 
sections. 


A. Transgenic Mice 

When DNA is injected directly into the pronucleus of a mouse
embryo, the sites of DNA integration are random; thus, DNA from any
source can be used because homologous recombination is not required
(Hogan et al., 1994). The offspring resulting from the microinjected
embryos are analyzed to determine whether they carry the transgene
in the germ line. Such transgenic mice can be used to study the
functions of both regulatory and coding regions of oncogenes.

By creating a transgene that uses the transcriptional regulatory
region of a gene to drive expression of a reporter gene, the ability of the
regulatory elements to direct transcription of the reporter gene in
various cell types during development and in the adult can be studied. For
example, although the proto-oncogene wnt-1 was known to regulate
dorsal-ventral patterning of the brain, a transgene study found that
the regulatory elements of the gene governing the wnt-1 expression
pattern reside in the 3' untranslated region (Echelard et al., 1994). The
cellular proteins that bind to and regulate expression from these
regulatory sequences can now be studied.

Other transgenes are typically constructed from a known transcrip¬ 
tional promoter driving expression of the proto-oncogene under study. 
Transgenic mouse strains have been developed that consistently devel¬ 
op certain types of tumors in response to expression of a specific on¬ 
cogene. Expression of the transgene can be driven by either a constitu¬ 
tive or a tissue-specific promoter, depending on the range of expression 
desired. Constitutive promoters are useful for determining those
tissues that are most easily transformed by a particular oncogene. For
example, constitutive expression of c-fos results in osteosarcomas
(Johnson et al., 1992). Tissue-specific promoters can be used to study
the effects of a highly transforming oncogene, such as c-myc, on a
limited range of cell types: mammary adenocarcinomas result when
c-myc expression is driven by a mouse mammary tumor virus
promoter, whereas B-cell lymphomas arise when an immunoglobulin
promoter is used (Morgenbesser and DePinho, 1994).

B. Gene Knockouts 

Homologous recombination in cultured, pluripotent embryonic stem 
(ES) cells can be used to make a cell line bearing a precise gene re¬ 
placement (Capecchi, 1989). In this procedure, a specifically designed 
DNA construct containing two homology regions flanking a selectable 
drug resistance marker is transfected into ES cells. The two homology 
regions direct recombination and the drug-resistance marker allows 
selection for cells stably maintaining the construct. After drug selec¬ 
tion and analysis of the resultant clones for successful homologous 
recombination, altered ES cells are injected into 16- or 32-cell stage 
blastocysts to produce mice chimeric for wild-type and transgenic cells. 
Some of the chimeric animals will have germ line cells derived from the 
altered ES cells. When these animals are crossed with a wild-type
mouse, they will produce nonchimeric animals heterozygous for one
wild-type and one recombinant allele. The heterozygous animals can 
then be used to generate animals homozygous for the recombinant 
allele. Usually, the goal of gene replacement studies is either to make a 
null mutation (gene knockout) via targeted disruption of the gene or to 
modify the gene by inserting a cassette containing a mutation found in 
a human gene (Galli-Taliadoros et al., 1995). 

Many null alleles of proto-oncogenes are embryonic lethal when
homozygous: c-myc knock-out mice die because of faulty yolk sac
circulation; NF-1, N-myc, WT-1, and RXRα null mutants die because of heart
defects; and Rb, c-myb, c-jun, met, and HGF null mutants die because
of failure of liver development or hematopoiesis (Stanton et al., 1992;
Jacks et al., 1994; Kreidberg et al., 1993; Sucov et al., 1994; Lee et al.,
1992; Hilberg et al., 1993; Bladt et al., 1995; Schmidt et al., 1995).
Unfortunately, embryonic lethal knockouts prevent the researcher 
from studying all roles of the gene under investigation because the 
animals die at the first developmental stage in which the gene product 
is absolutely required. Furthermore, it is difficult to determine wheth¬ 
er the lethal event is a direct result of the null mutation on the affected 
organ or whether the null mutation indirectly interferes with the de¬ 
velopment of the affected organ by disrupting the microenvironment 
supporting its development. Additionally, because organs are com¬ 
posed of a variety of cell types, determination of the cell type responsi¬ 
ble for the failure of organ development is difficult when all of the cells 
in the animal are homozygous for the null allele. 

ES cells bearing two copies of the knock-out allele can be used to gain 
additional information about the role of a gene during development 
when the null allele is embryonic lethal in the homozygous condition 
(Hilberg et al., 1993). Homozygous knock-out ES cells can be made 
using the heterozygous knock-out ES cells as the recipient cells for an 
additional round of homologous recombination using a similar
construct bearing a different drug resistance marker to target the
remaining wild-type allele. The doubly targeted ES cells are injected into
wild-type blastocysts, and these embryos are then implanted into the uterus
of a pseudopregnant mouse. Chimeric animals composed of wild-type
and homozygous knock-out cells will survive only if the double
knock-out cells do not contribute to tissues in which the gene product is
required. Thus, scoring of tissue contribution of the knock-out cells in 
the viable progeny will reveal the tissues in which the gene product is 
required. Scoring of tissues is facilitated by marking the knock-out 
cells with an easily detectable reporter gene. 

Systems for generating conditional knockouts have been or are
currently being developed that will allow null and other homologous
recombinant alleles to be targeted to a specific tissue. Such conditional
knockouts will facilitate the study of null alleles that are embryonic
lethal when homozygous; because the gene of interest is knocked out
only in a particular tissue, other specific aspects of the gene's function
besides the first lethal event can be studied. The first successful report
of a conditional knockout used the so-called Cre-loxP system based on
bacteriophage P1 to knock out the DNA polymerase β gene in T cells
(Gu et al., 1994). A similar system, which uses the FLP enzyme from
yeast, also is being developed (Barinaga, 1994). As of 1997, no
conditional proto-oncogene knockouts have been reported.
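
The logic of a tissue-restricted deletion can be sketched as a string operation. This is a toy model, not a description of the cited experiment: the flanking and exon sequences are hypothetical placeholders, Cre activity is reduced to a yes/no property of each tissue, and recombination is modeled simply as deletion of the DNA between two loxP sites in the same orientation, leaving a single loxP site behind.

```python
# The 34-bp loxP site: two 13-bp inverted repeats flanking an 8-bp spacer.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(locus: str) -> str:
    """Delete the sequence between the first two loxP sites, leaving one loxP behind."""
    first = locus.find(LOXP)
    if first == -1:
        return locus
    second = locus.find(LOXP, first + len(LOXP))
    if second == -1:
        return locus
    return locus[: first + len(LOXP)] + locus[second + len(LOXP):]


def locus_in_tissue(floxed_locus: str, tissue: str, cre_tissues: set) -> str:
    """Return the locus as it would appear in a tissue, given where Cre is expressed."""
    return cre_excise(floxed_locus) if tissue in cre_tissues else floxed_locus


# Hypothetical sequences standing in for flanking DNA and an essential exon.
upstream, exon, downstream = "GGATCC", "ATGGCCAAGTGA", "AAGCTT"
floxed = upstream + LOXP + exon + LOXP + downstream

# Assume Cre is driven by a promoter active only in T cells.
cre_tissues = {"T cell"}

t_cell_locus = locus_in_tissue(floxed, "T cell", cre_tissues)
liver_locus = locus_in_tissue(floxed, "liver", cre_tissues)
print("exon retained in T cells:", exon in t_cell_locus)
print("exon retained in liver:  ", exon in liver_locus)
```

Because the deletion occurs only where the recombinase is expressed, the rest of the animal keeps a functional gene, which is what allows an otherwise embryonic lethal null allele to be studied in a single tissue.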


V. Identity and Function of Genes Involved in Oncogenesis 

In this article, we discuss four functionally defined categories of 
genes involved in oncogenesis. As noted earlier, oncogenes are domi¬ 
nantly acting tumorigenic versions of normal cellular genes involved in 
signal transduction. In contrast, tumor-suppressor genes normally 
function to suppress cellular growth and thus loss or inactivation of 
both copies is required for transformation (Fearon, 1995). Tumor sup¬ 
pressor genes encode either transcription factors or cytoplasmic and 
cytoskeletal-associated proteins involved in signal transduction. Loss 
or inactivation of tumor-suppressor genes is largely responsible for 
inherited cancer predisposition syndromes, although one oncogene, ret,
has recently been identified as the cause of some cases of inherited
multiple endocrine neoplasia (Van Heyningen, 1994). Inactivation of
DNA repair enzymes is indirectly involved in oncogenesis because loss
of both copies of some DNA repair genes leads to cancer by increasing
the number of mutations that escape detection and repair in the
proto-oncogenes and tumor-suppressor genes. Another broad class of genes,
referred to here as metastasis-associated genes, also are not directly 
tumorigenic but are involved in cancer because they render tumor cells 
competent to migrate to, invade, and colonize a secondary site. 

A. Oncogenes and Proto-oncogenes in Signal Transduction 

The dominant transforming genes are collectively called oncogenes; 
their normal cellular counterparts are proto-oncogenes. Oncogene-en¬ 
coded proteins superimpose their activity on the cell to elicit the trans¬ 
formed phenotype and are thus dominant genes considered to be posi¬ 
tive regulators of growth. To the transformed cell, oncogene expression 
results in a gain of function. Frequently, oncogenes encode altered
versions of the normal cellular proteins encoded by their cognate
proto-oncogenes and are created by chromosomal rearrangement, by the
action of transforming retroviruses, or by point mutations affecting
specific domains of the protein. Sometimes, a normal cellular protein is
transforming when over- or aberrantly expressed because of alter¬ 
ations in the gene’s transcriptional promoter or by amplification of 
gene copy number. Although the term proto-oncogene tends to suggest 
that proto-oncogenes reside in the genome for the sole purpose of ex¬ 
pressing the neoplastic phenotype, they in fact encode proteins essen¬ 
tial to the normal biological processes by which cells respond to extra¬ 
cellular and intracellular signals to grow, divide, and differentiate in a 
regulated fashion. Proto-oncogenes have been highly conserved during
evolution and it is therefore possible to clone the proto-oncogene
homologs from genetically well-characterized and easily manipulated
organisms such as fruit flies, worms, frogs, and mice for further study
(Shilo and Weinberg, 1981). Thus, the phenotypic influence of mutated
proto-oncogene homologs on cell division, differentiation, and
embryological development can be tested, and genes can be identified that
suppress the mutant phenotype, thereby identifying other members of
the biochemical pathway.

The coordinated growth, division, and differentiation that are essen¬ 
tial for viable multicellular organisms are mediated through complex 
pathways that propagate and amplify the signal from outside the cell 
to specific targets within the cell (e.g., the cytoskeleton and the nu¬ 
cleus). Because growth is restricted to a subset of cell types within 
most organs of mature multicellular organisms, these pathways are 
highly regulated. The chain of events allowing a cell to respond to a 
growth or differentiation signal is called signal transduction. Each 
control point along the pathway can be the target of deregulation by 
oncoproteins, with subsequent overexpression or ectopic expression of 
normal or mutated proteins. Regardless of the exact structure of the 
oncoprotein, loss of regulated signaling by oncogene expression can 
force a cell into uncontrolled cell division, or invasive growth. Hence, 
the classes of proteins involved in signal transduction define the 
classes of oncogenes (see Table 1). 

External growth factors initiate the signaling cascade by binding to 
their cognate receptors on the surface of target cells. The ligands for 
growth factor receptors can be either soluble or present on other cells 
or in the extracellular matrix. Growth factors must be supplied to 
untransformed cells in culture to stimulate cell proliferation. Trans¬ 
formed cells show partial or complete relaxation of the requirements 
for growth factors when oncogenes override growth factor dependency
by mimicking the action of ligands, their receptors, or downstream
signals (Rijsewijk et al., 1987). Some oncogene products have been 
determined to function as extracellular growth factors when expressed 
in normal cells, including the product of the v-sis oncogene (Bishop, 
1991). Once expressed in nonphysiologic quantities or in aberrant tis¬ 
sues, uncontrolled cell proliferation results, leading to tumorigenesis. 

A large number of tyrosine kinase growth factor receptors has 
emerged from studies of oncogenes. To this sizable number can be 
added other tyrosine kinase receptors cloned on the basis of homology 
or their role in fruit fly and worm development (Hanks, 1991). These 
receptors contain a number of common structural features that are 
important to their function (Cantley et al., 1991). All have large,
glycosylated, extracellular ligand-binding domains. Many function as
homodimers or as heterodimers of α and β chains. The cytoplasmic
portion of cell surface receptors contains tyrosine kinase domains that are
required for signal transduction. Phosphorylation sites vary among the
proteins and determine phosphorylation function. These sites
modulate receptor activity and provide recognition sites for receptor targets.
Following the binding of ligand to the extracellular domain, these
receptors undergo dimerization and activation of tyrosine kinase activity.
This leads to autophosphorylation as well as binding and
phosphorylation of other intracellular target proteins. Both N-terminal and
C-terminal rearrangements appear to activate the transforming potential of
these transmembrane receptor tyrosine kinase family members. These
alterations may remove downmodulating domains of the protein and
result in the constitutive activation of what is normally a conditionally
regulated enzyme activity. Tyrosine kinase growth factor receptor
proto-oncogenes include c-fms, c-erbB, c-ros, c-kit, met, and ret (see Table
1; Hanks, 1991).

Ligand binding induces a conformational change in the cytoplasmic 
domain of the receptor and, in the case of tyrosine kinase growth factor 
receptors, this results in receptor clustering and activation of receptor 
function by autophosphorylation (Cantley et al., 1991). The lipid bilay¬ 
er of the plasma membrane allows for lateral movement, dimerization 
or patching of receptors, and close interaction between the cytoplasmic 
tails of adjacent receptors (for crossphosphorylation) as well as be¬ 
tween the receptors and a number of proteins localized to the inner 
surface of the plasma membrane. 

The activated receptor serves as a nexus for the assembly of a multi¬ 
protein complex of signal transducers, the exact composition of which 
depends on the cell and signal type. Many of the proteins involved in 
these complexes have been implicated in oncogenesis, including the
membrane-associated tyrosine kinases encoded by the abl, src, fgr, yes,
and fps genes, and the product of the H-ras gene, p21ras. Other
proteins that complex with activated receptors at the cytoplasmic side of
the cell membrane during signal transduction include
phosphatidylinositol-3 (PI3) kinase, the GTPase activating protein of p21ras
(ras-GAP), protein kinase C (PKC), and phospholipase Cγ (PLCγ). Some of
the signaling proteins are localized at the plasma membrane by
structural domains, such as basic motifs and posttranslational
modifications, including myristylation, farnesylation, and palmitoylation, while
others, including ras-GAP, PI3 kinase, PKC, and PLCγ, are recruited
to the plasma membrane within seconds following receptor activation
(Cantley et al., 1991; Aaronson, 1991). Cytoplasmic proteins, such as
the serine-threonine kinases encoded by the raf and mos
proto-oncogenes, are also involved in the signal transduction cascade.

The plasma membrane itself appears to be integrally involved in the 
generation of molecules that propagate the signal by serving as a pool 
of substrates for several key enzymes within the complex. These enzymes
include PLCγ, which catalyzes the breakdown of phosphatidylinositol
phosphates to generate diacylglycerol (DAG); phospholipase D,
which converts phosphatidylcholine to a precursor of DAG; and PI3
kinase, which generates phosphatidylinositol derivatives.
These seemingly simple signaling molecules have diverse effects
on the cell, including activation of PKC by DAG, release of calcium from
intracellular stores by inositol triphosphates, translocation of certain
enzymes to the plasma membrane, and regulation of cytoskeletal proteins
by polyphosphoinositides, among many other responses (Majeras
et al., 1990).

The endpoints of signal transduction are usually either the cyto- 
skeleton, which is the key effector of changes in cell shape and motility, 
or the nucleus, where regulation of cell cycle and gene expression takes 
place. Although much progress has been made in elucidating compo¬ 
nents of signal transduction pathways, significant gaps exist in our 
understanding, particularly in bridging the events at the cell mem¬ 
brane with events in the nucleus. Several of the nuclear proteins en¬ 
coded by proto-oncogenes and oncogenes are site-specific DNA binding 
proteins, which act as transcription factors to directly regulate gene 
expression, ultimately leading to control of cell proliferation, division, 
and differentiation. Nuclear oncogenes, which encode transcription
factors, include myc, myb, ets, fos, jun, erbA, NFκB, ski, and many
others (Lewin, 1994). The balance of specific transcription factors
expressed within a cell is believed to be a primary way that signal
transduction specificity emerges out of a set of pathways that overlap
considerably from one system to another. Disruption of this delicate
balance by expression of oncogene-encoded transcription factors is
transforming because it induces expression of inappropriate downstream genes.

Not all proto-oncogenes function in signal transduction. For exam¬ 
ple, the bcl-2 proto-oncogene, identified by a chromosomal breakpoint 
which activates bcl-2 in some B cell lymphomas, prevents apoptosis of 
progenitor cells in which it is expressed by preventing oxidative dam¬ 
age to certain cellular components (Korsmeyer et al., 1993). Lymphoid 
cells of transgenics overexpressing bcl-2 fail to undergo apoptosis and 
develop into lymphomas (Cory et al., 1994). Conversely, bcl-2 knock-out
animals undergo inappropriate apoptosis of lymphoid precursor
cells in the spleen and thymus, confirming the essential role of bcl-2 in
regulating apoptosis (Hockenbery, 1994).


B. Tumor-Suppressor Genes 

In contrast to the oncogenes, tumor-suppressor genes normally function
to suppress cellular growth. Their wild-type alleles are therefore dominant
to their transforming alleles, which are recessive because
they are inactivated or deleted. Germ line heterozygosity for a
wild-type allele and an inactivated allele of a tumor-suppressor gene is
commonly the underlying cause of cancer predisposition syndromes. 
Tumors arise in such heterozygous individuals upon loss of the remain¬ 
ing wild-type allele. Frequently, inherited cancer predisposition syn¬ 
dromes present as autosomal dominant diseases caused by the high 
frequency of loss of the remaining wild-type allele in individuals with 
germ line heterozygosity for a mutant allele (Weigel, 1995).
Tumor-suppressor genes have been identified as the cellular targets of the
transforming proteins of DNA tumor viruses and as genes whose functions
are lost in certain cancer predisposition syndromes. Known
tumor-suppressor genes encode either transcription factors or proteins
associated with cytoplasmic or cytoskeletal components
(see Table 3).


1. RB1

Cytogenetic studies of individuals in families transmitting familial 
retinoblastoma have led to the development of the model that invokes 
the loss of a tumor-suppressor gene as the basis for predisposition to 
retinoblastoma and other heritable cancer predisposition syndromes. 
Retinoblastoma accounts for 1% of all cancer deaths among children 
and is the most common malignant eye tumor. It can be multifocal and
bilateral, and 40% of the cases are familial. Karyotypic analysis of
tumor tissue and nontumor tissue revealed that about 5% of individuals
with retinoblastoma carried deletions in the region of chromosome
13q14 in tumor cells, suggesting that tumors arose in these individuals
when the second copy of the responsible gene was lost as a result of the
deletion. The remaining copy was assumed to carry an undetected
deletion.

Unambiguous support came with the demonstration that DNA 
probes to the chromosome 13q14 region failed to detect this DNA in
tumors, while DNA from nontumor tissue from the same patient was
found to contain a single copy of DNA from the 13q14 region (Cavenee
et al., 1983). These investigations revealed that there was reduction to
homozygosity (or loss of heterozygosity) in the tumor DNA for genes in
the vicinity of chromosome 13q14, caused by nondisjunction, mitotic
recombination, or gene conversion, leaving the tumor cells with a single,
presumably damaged, copy of the RB1 gene. Cavenee and coworkers
(1983) later showed that it was the allele from the unaffected parent
that was lost and the allele from the affected parent that remained
in somatic tissue during tumor development, consistent with the loss of
a tumor-suppressor gene.
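
The comparison of tumor and nontumor DNA described above can be illustrated with a short Python sketch. It is only an illustration of the logic of scoring loss of heterozygosity: the marker names, allele labels, and the function name loss_of_heterozygosity are hypothetical and are not taken from the studies cited.

```python
def loss_of_heterozygosity(normal, tumor):
    """Report markers that are heterozygous in normal tissue but
    reduced to a single allele in tumor tissue.

    normal, tumor: dict mapping a marker name to the set of alleles seen.
    Returns a dict mapping each affected marker to the allele that was lost.
    """
    lost = {}
    for marker, normal_alleles in normal.items():
        tumor_alleles = tumor.get(marker, set())
        # Only markers with two distinct alleles in normal tissue are informative.
        informative = len(normal_alleles) == 2
        reduced = len(tumor_alleles) == 1 and tumor_alleles < normal_alleles
        if informative and reduced:
            lost[marker] = next(iter(normal_alleles - tumor_alleles))
    return lost


# Hypothetical genotypes at three made-up markers near 13q14.
normal_tissue = {"marker_1": {"a", "b"}, "marker_2": {"a"}, "marker_3": {"a", "b"}}
tumor_tissue = {"marker_1": {"b"}, "marker_2": {"a"}, "marker_3": {"a", "b"}}

print(loss_of_heterozygosity(normal_tissue, tumor_tissue))
```

In this made-up example only marker_1 is scored, because marker_2 is homozygous in normal tissue and therefore uninformative, and marker_3 remains heterozygous in the tumor.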

Equipped with DNA probes for the region of chromosome 13 and 
RNA extracted from retinoblastomas, Friend and co-workers were able 
to isolate the gene that confers susceptibility to retinoblastoma (Friend 
et al., 1988). The gene (RB1) encodes a protein designated p105-RB.
That RB1 is a tumor-suppressor gene was demonstrated by the
reversion of transformed cells cultured from tumors to a
nontransformed phenotype following reintroduction of the normal RB1
gene, a finding that may ultimately have implications for cancer treatment
(Bookstein et al., 1990). By RNA analysis, the RB1 gene was found to
be expressed in most tissues, which was surprising because in familial
retinoblastoma only two types of tumors develop: retinoblastomas and,
after long latent periods, osteosarcomas (Lee et al., 1987). It is likely
that other tumor-suppressor genes are expressed in other tissues in
which RB1 is expressed, rendering RB1 expression redundant and
thus protecting those tissues from RB1-mediated tumorigenesis. In
addition to retinoblastomas and osteosarcomas, loss of RB function is 
associated at variable frequency with some sporadic tumor types, in¬ 
cluding sarcomas and carcinomas of the lung, breast, and prostate. 
Oncogenesis leading to these tumors is thought to involve inactivation 
of additional tumor-suppressor genes because these tissues are un¬ 
affected in individuals carrying familial RB predisposition genes. 

p105-RB is a nuclear protein with properties of a cell cycle regulator
that exists in phosphorylated and hypophosphorylated forms (De Caprio
et al., 1988). p105-RB is hypophosphorylated in G0 and immediately
after mitosis before G1 and becomes phosphorylated upon stimulation
with mitogen. Phosphorylation of p105-RB is highest at the
start of S phase (Buchkovick et al., 1989; Chen et al., 1989).
Hypophosphorylated p105-RB complexes with the transcription factor E2F,
blocking its ability to transcribe S phase-specific genes at times after S
phase. Late in G1, phosphorylated p105-RB cannot bind E2F, which is
then free to promote transcription of S phase-specific genes involved in
DNA synthesis. When complexed with hypophosphorylated p105-RB,
DNA viral oncoproteins SV40 large T antigen and HPV E7 inactivate
p105-RB and prevent it from interacting with E2F and suppressing S
phase-specific transcription and thus from suppressing proliferation
(Nevins, 1992).
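
The regulatory logic just described can be summarized as a simple truth function. The Python sketch below is a deliberately reduced model, not a description of the biochemistry: the function and parameter names are hypothetical, and the rb_present argument reflects the assumption that loss of both RB1 alleles leaves E2F unrestrained.

```python
def e2f_is_free(rb_phosphorylated, oncoprotein_bound=False, rb_present=True):
    """Return True when E2F is free to drive S phase-specific transcription.

    In this simplified model p105-RB restrains E2F only when it is present,
    hypophosphorylated, and not sequestered by a viral oncoprotein
    (for example SV40 large T antigen or HPV E7).
    """
    rb_restrains_e2f = rb_present and not rb_phosphorylated and not oncoprotein_bound
    return not rb_restrains_e2f


# G0 or early G1: hypophosphorylated p105-RB holds E2F in check.
print(e2f_is_free(rb_phosphorylated=False))
# Late G1: phosphorylated p105-RB releases E2F.
print(e2f_is_free(rb_phosphorylated=True))
# Viral oncoprotein bound to hypophosphorylated p105-RB: E2F is released.
print(e2f_is_free(rb_phosphorylated=False, oncoprotein_bound=True))
```

The three calls correspond to the three situations in the text: restraint in G0 and early G1, release late in G1, and release caused by a DNA tumor virus oncoprotein.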

Given that RB is associated with a broad spectrum of tumors and is 
targeted by the transforming proteins of the DNA tumor viruses, it 
was believed that RB must play a critical role in controlling cellular 
proliferation during development and that animals homozygous for the 
null RB allele would die very early in embryogenesis because of a broad 
range of defects resulting from the loss of cellular proliferation control. 
Surprisingly, RB knock-out mice are viable until the fetal period when 
they die from defects in hematopoiesis (Lee et al., 1992). RB knock-out 
mice also have defective peripheral and central nervous systems; both 
the hematopoiesis and nervous system defects are thought to result 
from the failure of the cells to cease proliferation at a specified point 
during differentiation. Thus, as expected, the RB knock-out phenotype 
is embryonic lethal because of defects in cellular proliferation control. 
However, the relatively limited effects of the RB mutation were not 
expected and provide independent evidence that other tumor-suppressor
genes must function in the unaffected tissues to control proliferation.
This hypothesis is supported by the fact that only two types of
tumors routinely occur in individuals with familial retinoblastoma. 
Interestingly, mice heterozygous for the null RB allele do not develop 
retinoblastoma, as do humans heterozygous for a null RB allele, but 
instead develop tumors of the pituitary when the second RB allele is 
lost in somatic cells. This unexpected phenotype indicates that other 
tumor-suppressor genes may function in the murine retina to compen¬ 
sate for the loss of RB and may point to a genetic treatment of reti¬ 
noblastoma in humans (Matzuk and Bradley, 1994). 

2. p53 

As with RB-1, p53 was found to form a tight complex with the transforming
proteins of the DNA tumor viruses, such as SV40-encoded
large T antigen, which suggested that inactivation of p53 might be
involved in oncogenesis. Subsequently, alterations at the p53 locus on
chromosome 17p were found in a large percentage and wide variety of
human tumors, and p53 is currently the most commonly altered gene
associated with human cancers (Levine et al., 1991b; Hollstein et al., 1991).
Indeed, about 75% of colon tumors show abnormalities at both p53 
alleles: One allele is often deleted, while the other has point mutations, 
which are usually missense mutations that yield an altered protein 
product. Interestingly, inactivation of p53 in most colorectal tumors is 
sporadic, resulting from somatic mutations. Individuals with the can¬ 
cer-predisposing Li-Fraumeni syndrome are born with mutations in 
one allele of the p53 gene and develop tumors that bear mutations at 
both alleles, as predicted by Knudson’s hypothesis developed in the 
analysis of retinoblastoma (Malkin et al., 1990). As with RB-1, wild- 
type p53 can reverse the transformed phenotype of tumor cells that are 
transformed because of loss of both copies of p53. 

The point mutations found in the p53 gene are highly clustered at 
points that define its functional domains. Most mutations occur in one 
of three amino acids, 175, 248, and 273; other mutations are clustered 
within four short regions (Levine et al., 1991a). Certain mutations are
more common in certain tumor types, which may reflect that different 
cell types are sensitive to the loss of specific p53 functions. p53 is a 
nuclear phosphoprotein containing a transcription activation domain 
at its amino terminus, a site-specific DNA binding motif in its central 
region, and a tetramerization domain at its carboxy terminus. Inter¬ 
estingly, some transforming mutations in the tetramerization domain 
act in a dominant negative fashion and can thus inactivate wild-type 
p53 in heterozygous cells. Similarly, p53 can be inactivated in cells 
overexpressing the product of the MDM2 gene, which binds to and 
inactivates p53. When overexpressed, MDM2 acts as an oncoprotein by 
virtue of its binding to p53 (Momand et al., 1992). 

p53 is believed to execute its role as a tumor suppressor by blocking 
DNA replication under certain circumstances and by regulating tran¬ 
scription of particular genes. Wild-type p53 positively regulates tran¬ 
scription of the GADD45 and WAF1 genes, which encode proteins in¬ 
volved in DNA repair and cell cycle control, respectively. p53 is thought 
to function by channeling cells with damaged DNA into either G1 
arrest for repair of DNA damage or into apoptosis if damage is irreparable
(Lee et al., 1994). Cells that lack wild-type p53 fail to arrest in G1
for DNA repair prior to initiating replication and thus accumulate 
additional mutations at an accelerated rate. Given the role of p53 in 
cell cycle progression, p53 knock-out mice were expected to have great 
genomic instability and thus a heightened frequency of tumorigenicity 
if they survived embryonic development. Production of these animals
has shown that they are able to develop normally but have a much
greater incidence of tumor development resulting from oncogene
overexpression caused by genomic instability (Rong et al., 1995; Donehower
et al., 1995). The heterozygous animals also develop tumors at a greater
than normal rate; these animals are an excellent model for
Li-Fraumeni tumor susceptibility syndrome, which, in humans, results
from the p53 heterozygous condition (Sands et al., 1994).

Interestingly, although loss of p53 in colon tumors is quite common, 
neither p53-deficient mice nor Li-Fraumeni patients develop colon can¬ 
cer at a rate greater than that seen in the general population. This 
unexpected result may provide information concerning the order of 
acquisition of mutations and resultant phenotypes in colon cancer pro¬ 
gression (Sands et al., 1994). In the colon, cells originating in the crypt 
are ultimately sloughed off in the villi; thus, although the p53-deficient 
colon cells have greater genetic instability compared to normal colon 
cells, the colon is protected from tumor formation in the p53-deficient 
animals by rapid cellular turnover. Colon tumors usually develop from 
benign polyps because cells within the polyps do not turn over as rap¬ 
idly, thus allowing accumulation of cells bearing transforming muta¬ 
tions. 


3. WT-1 

Wilms' tumor is a nephroblastoma arising in young children in both 
sporadic and inherited modes. In both cases, tumors are caused by loss 
or inactivation of both alleles of WT-1 (Haber and Housman, 1992). 
WT-1 expression is restricted to the developing kidney, which accounts 
for the limited spectrum of tumors with which its mutant alleles are 
associated. WT-1 encodes a zinc finger transcription factor, which regulates
expression through the same DNA binding site as that bound
by the products of the early growth response genes EGR1 and EGR2. Normally, WT-1
can repress transcription of several growth-associated genes and up- 
regulate expression of other genes depending on the p53 background 
(Rauscher, 1993). Loss of the transcriptional repression of growth-in¬ 
ducing genes, including several growth factor receptors and some tran¬ 
scription factors, is believed to cause cellular transformation. 

4. BRCA1 

Over half of all cases of inherited premenopausal breast cancer, 
many inherited cases of ovarian cancer, and a significant fraction of 
sporadic breast cancers are caused by transmission of altered alleles of 
the BRCA1 gene (Miki et al., 1994). An additional gene linked to breast 
cancer but not to ovarian cancer, BRCA2, has been identified (Wooster 
et al., 1994). As with RB-1, p53, and WT-1, BRCA1 encodes a transcription
factor. As with all the tumor-suppressor genes encoding transcription
factors, predisposition for cancers with which they are associated
presents as an autosomal dominant trait in individuals with germ line 
heterozygosity for inactivated alleles because of the frequency of loss of 
the remaining wild-type allele. Whereas the cumulative lifetime risk 
for breast cancer in individuals with no BRCA1 mutation is about 10%, 
the cumulative risk for individuals heterozygous for a mutant BRCA1 
allele is approximately 90%. Thus, detecting individuals who have in¬ 
herited BRCA1 mutations would greatly enhance our ability to predict 
a high predisposition to breast cancer. 

5. Cytoskeletal Components of Cellular Junctions 

As with the tumor-suppressor genes encoding transcription factors, 
other tumor-suppressor genes are involved in signal transduction, al¬ 
beit at different points in the cascade. The genes responsible for type 1 
and type 2 neurofibromatosis, NF1 and NF2, function in the cytoplasm 
and in the cytoskeleton, respectively. Tumors associated with the neu¬ 
rofibromatosis syndromes occur in individuals with germ line muta¬ 
tions after somatic events inactivate the remaining wild-type allele 
(The et al., 1993). Although NF1 is widely expressed, inactivation of 
both copies of NF1 results in tumors restricted to tissues of neural 
crest origin, presumably because of the expression of functionally re¬ 
dundant proteins in other tissues. NF1 encodes neurofibromin, which 
functions as a GTPase activating protein that can inactivate the ras 
gene product by converting its bound GTP to GDP (Xu et al., 1994). 
Thus, loss of neurofibromin function is thought to be transforming by
increasing the signal transduction activity of ras. NF2 encodes
schwannomin, which is believed to link the cellular membrane with
cytoskeletal components. Loss of function of schwannomin is thought to lead to
tumors of the central nervous system by altering the cellular response 
to cell-cell and cell-substrate signals in those cells (Trofatter et al., 
1993). 

One form of inherited predisposition to colorectal cancer, familial 
adenomatous polyposis (FAP), results from inactivating mutations in 
the adenomatous polyposis coli (APC) gene (Su et al., 1993). Addi¬ 
tionally, somatically acquired mutations in both APC alleles appear to 
be involved in a large fraction of sporadic colorectal tumors. APC encodes
a large protein that resides in the cytoplasm and binds to members
of the catenin family, which are involved in adherens junctions of
epithelial cells. Loss of function of APC is thought to be transforming
by interfering with adhesion signals, leading to loss of contact inhibition
of proliferation. A mouse intestinal tumor model has been identified
in which germ line inactivation of the mouse APC gene results in
multiple intestinal neoplasias (MINs; Moser et al., 1990). The MIN
mice express different phenotypes depending on their genetic back¬ 
ground; study of the MIN mouse tumor model will thus provide addi¬ 
tional insight into the process of oncogenesis mediated by APC. 

C. DNA Repair Genes 

DNA repair enzymes are also involved, albeit indirectly, in on¬ 
cogenesis. Although no DNA repair genes are directly transforming, 
loss or inactivation of both copies of some DNA repair genes leads to 
cancer by increasing the rate of accumulation of mutations in the pro¬ 
to-oncogenes and tumor-suppressor genes. Interestingly, specific DNA 
repair genes are associated with specific tumor types. Although their 
role in oncogenesis is indirect, the impact of the transforming alleles of 
DNA repair genes on human health is significant, especially with re¬ 
gard to colorectal cancer. 

Inherited predisposition of colorectal cancer accounts for roughly 
15% of the approximately 150,000 cases of colorectal cancer reported 
each year (Hamilton, 1992). Two forms of inherited colorectal cancer 
have been identified: FAP (discussed previously) and hereditary non¬ 
polyposis colorectal cancer (HNPCC), which accounts for the majority 
of cases of familial colorectal cancer. The genes predisposing individuals
to HNPCC, MSH2 and MLH1, encode DNA mismatch repair enzymes,
which, when inactivated, predispose the cell to “stuttering”
during replication of certain dinucleotide repeats (Thibodeau et al., 
1993). This stuttering effect results in genetic instability when extra 
copies of the dinucleotide repeats are inserted. Such insertions can 
cause inactivation of tumor-suppressor genes or activation of proto¬ 
oncogenes. HNPCC presents as an autosomal dominant disease be¬ 
cause of the high rate of mutation of the remaining wild-type allele in 
individuals carrying one mutant allele. 

Other hereditary cancers resulting from defects in DNA repair genes 
are recessive, presumably because of low rate of mutation of the re¬ 
maining wild-type allele in individuals carrying one mutant allele. 
Frequently, alterations of more than one gene can be causal; the various
genes associated with each of the syndromes define the genes
involved in specific repair pathways. Defects in helicases and in the
nucleotide excision repair pathway account for the skin tumors associated
with xeroderma pigmentosum by causing heightened sensitivity
to ultraviolet irradiation (Friedberg, 1992). Leukemia associated with
Fanconi’s anemia results from loss of function of genes involved in the
repair of DNA cross links (Friedberg, 1992; Strathdee et al., 1992).
Lymphomas in individuals with ataxia telangiectasia arise because of
defects in genes of the x-ray response pathway (Gatti et al., 1991).
Defects in a DNA ligase gene are believed to be responsible for tumors
arising in individuals with Bloom syndrome (Heim et al., 1992).

D. Metastasis-Associated Genes 

Currently, tumor metastasis is the most challenging problem faced 
by oncologists in both the clinic and the laboratory. As a result of 
genetic instability, tumors are composed of a population of hetero¬ 
geneous cells that accumulate a range of genetic alterations in addition 
to the original transforming mutations (Fidler and Poste, 1985). With¬ 
in this heterogeneous population, some cells capable of metastasis can 
develop. Metastasis-associated genes encode a diverse array of proteins,
including adhesion molecules, cytoskeletal components, proteases,
and regulatory molecules, which are not themselves transforming
but which equip the transformed cell to metastasize. In some cases,
mutation results in loss of function, for example, of adhesion molecules,
resulting in increased cellular motility. In other cases, expression
of metastasis-associated genes, such as proteases, in cells that do
not normally express them produces cells capable of invasion. 

Fortunately, metastasis is a highly selective process, which very few 
tumor cells are capable of successfully completing (Fidler, 1995). For 
tumor cells to metastasize, the tumor must be well vascularized to 
support growth of the tumor beyond a few millimeters in diameter and 
to provide a route of metastasis. Consequently, a subset of the tumor 
cells must produce angiogenic factors capable of inducing endothelial 
cells to proliferate and form capillaries within the tumor. Furthermore, 
metastatic cells must be capable of detaching from the tumor mass and 
invading either the lymphatic or circulatory systems. These steps re¬ 
quire reduced cellular adhesion to the tumor cell mass and increased 
affinity for the endothelial cells of the blood vessels, as well as in¬ 
creased motility through extracellular matrix of the basement mem¬ 
branes. Once launched, the candidate metastatic cell must survive the 
mechanical shear forces of circulation, evade detection and clearing by 
the immune system, and adhere to a capillary wall at the secondary 
site. Adhesion to and escape through the capillary wall into the sur¬ 
rounding tissue again requires affinity for the endothelial cells as well 
as motility and the ability to invade through extracellular matrix of 
the basement membranes. Once the secondary site has been penetrated,
the successful metastatic cell must be able to proliferate; this
step requires that the cell’s proliferation not be adversely affected by
factors present in the new microenvironment and that sufficient nutri¬ 
ents and growth factor are present to sustain the secondary tumor. 

The standard model for the development of the metastatic phenotype
holds that each of the characteristics necessary to support metastasis
is acquired as the result of independent genetic alterations
within the transformed cell. Given that a tumor cell must initiate a
gene expression program that correctly balances the production of an¬ 
giogenic factors, motility factors, cytoskeletal factors, proteases, and 
myriad regulatory molecules, as well as downregulate expression of 
key adhesion molecules, it is surprising that any transformed cell can 
do so by the accumulation of multiple independent genetic alterations. 
One can imagine that if such a gene expression program could be 
triggered with a single change, then metastasis would become a much 
more likely outcome. It is becoming clear that in the case of autocrine 
transformation by overexpression of the met proto-oncogene product 
and its ligand HGF/SF, activation of this single transforming gene 
product can indeed initiate many if not all of the ancillary cellular 
functions just listed that are necessary for expression of the metastatic 
phenotype. 

The c-met proto-oncogene encodes the tyrosine kinase growth factor
receptor for the HGF/SF growth factor (Bottaro et al., 1991). HGF/SF
is normally produced by mesenchymal cells and acts in a paracrine or
endocrine fashion on met-expressing epithelial cells, although met is
occasionally expressed in some populations of mesenchymal cells
(Rosen et al., 1994). From gene knock-out studies it was learned that
during embryonic development met-expressing myogenic precursors
originating in the somites migrate to the presumptive limb buds in
response to HGF/SF production at that site (Bladt et al., 1995).
Although the genes expressed during this process of invasion and migration
are just beginning to be identified, it is clear that many of the
same gene products, when inappropriately expressed, could fulfill
many of the requirements for metastasis. Besides transforming
HGF/SF-expressing mesenchymal cells, met expression triggers the
ability of the cells to degrade extracellular matrix and basement membrane
by inducing the expression of proteases such as collagenase and
the urokinase-type plasminogen activator (Rong et al., 1994; Jeffers et
al., 1996a; Jeffers et al., 1996b). Additionally, met-HGF/SF signaling
enhances cellular motility and causes a reduction in cell-cell adhesion
(Rubin et al., 1993). These effects, coupled with the mitogenic effect of
met-HGF/SF co-expression and the ability of HGF/SF to promote
angiogenesis of endothelial cells, suggest that transformation by
overexpression of the c-met oncogene may activate a metastatic
program of gene expression (Grant et al., 1993). Co-expression of met
and HGF/SF in mesenchymal cells is not only transforming but also
produces metastases rapidly and at high frequency, supporting the
idea that inappropriate expression of the met-mediated embryonic
migration program can be used by the transformed cells to drive metastasis
(Rong et al., 1992; Jeffers et al., 1996a). Furthermore, it is likely
that this mechanism plays an important role in metastasis of human
sarcomas because a high percentage of sarcoma cell lines and tumor
samples express met inappropriately (Rong et al., 1993; Rong et al.,
1995; Cortner and Vande Woude, 1996).


VI. Applications of Oncogene Research 

A. Screening for Cancer Predisposition 

A primary goal of oncogene research is the development of diagnostic 
assays to identify individuals at heightened risk for cancer. Although 
widespread screening of the general population for all of the known 
cancer genes is currently not feasible, targeted screening of certain 
groups for loss of or mutations in specific genes is feasible. For example, 
it is estimated that 1 in 200 women carries alleles of BRCA1
predisposing to either breast or ovarian cancer. Identifying that 0.5%
of the female population would dramatically reduce the number of cases
of advanced breast cancer by identifying those individuals who should 
be screened early and frequently by mammogram for the occurrence of 
nonpalpable, nonmetastatic tumors (King et al., 1993). Such early de¬ 
tection and treatment would save money as well as lives. By focusing 
screening efforts only on individuals known to be at greater risk be¬ 
cause of higher rates of breast cancer among relatives, the majority of 
women predisposed to breast cancer because of BRCA1 mutations could 
be identified. 
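
A rough calculation shows what these figures imply. The Python sketch below combines the carrier frequency given here (1 in 200) with the approximate lifetime risks quoted in the BRCA1 section (about 90% for carriers and about 10% for noncarriers). It treats those risks as simple probabilities and ignores age structure, ovarian cancer, and other predisposing genes, so the result is an order-of-magnitude estimate under those assumptions; the function name is hypothetical.

```python
carrier_frequency = 1 / 200   # from the text: 1 in 200 women
risk_carrier = 0.90           # approximate lifetime risk for carriers
risk_noncarrier = 0.10        # approximate lifetime risk for noncarriers


def carrier_share_of_cases(freq, risk_in_carriers, risk_in_noncarriers):
    """Fraction of all lifetime breast cancer cases expected in carriers."""
    cases_in_carriers = freq * risk_in_carriers
    cases_in_noncarriers = (1 - freq) * risk_in_noncarriers
    return cases_in_carriers / (cases_in_carriers + cases_in_noncarriers)


share = carrier_share_of_cases(carrier_frequency, risk_carrier, risk_noncarrier)
relative_risk = risk_carrier / risk_noncarrier

print(f"Share of cases arising in carriers: {share:.1%}")
print(f"Relative lifetime risk, carrier vs. noncarrier: {relative_risk:.0f}-fold")
```

Under these assumptions carriers make up 0.5% of women but account for roughly 4% of lifetime cases, which is the sense in which targeted screening concentrates effort on a small, high-risk group.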

Although screening for cancer genes has many merits, there are 
compelling reasons why, under current social conditions, even at-risk 
individuals might choose to forgo testing if it were made available. A 
major problem associated with making tests for cancer predisposition 
widely available is that they may jeopardize the ability of high-risk 
patients to obtain insurance (Ostrer et al., 1993). If tests are widely 
available, insurers or employers could conceivably require negative 
results as a condition for employment or coverage as a means of limiting
lost productivity and excluding individuals likely to require expensive
medical procedures. Given these problems, it is clear that we need
to establish guidelines for testing such that at-risk individuals are not 
penalized by loss of employment or insurance coverage if we are to 
benefit from genetic screening for cancer predisposition. 

Methods for cancer gene screening vary depending on the genes 
being assayed and the nature of the mutations that inactivate them. If 
an inherited gene is inactivated by deletion of a large chromosomal 
region, then cytogenetic studies can be used as a screen. If sub- 
microscopic deletions are responsible, then FISH could be used. Alter¬ 
natively, biochemical testing for reduced levels of protein or protein 
activity of the product encoded by the tumor-suppressor gene or an 
adjacent gene that is also deleted could be used. If the gene is inacti¬ 
vated by sequence amplification or by certain mutations that alter the 
restriction pattern of the genomic DNA, then the cancer-predisposing 
alleles can be detected by DNA hybridization analysis in conjunction 


with RFLP mapping. If the gene is inactivated by one or more point 
mutations that do not produce discernible RFLPs, then sequence anal¬ 
ysis is required to detect the mutation. Gene sequence can be deter¬ 
mined by biochemical or hybridization-based methods. If the gene is 
inactivated by very specific point mutations that recur in unrelated 
individuals, as is the case for p53, then those regions of the gene can be 
analyzed specifically. If the gene can be inactivated by a wide array of 
mutations, then screening by sequence analysis becomes more difficult
because the entire gene may need to be sequenced.
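
The choice of screening method described above can be summarized as a simple lookup. The following Python sketch is illustrative only: the lesion categories and their labels are our own shorthand for the cases named in the text, not an established classification, and a real laboratory workflow would weigh cost, sample availability, and sensitivity as well.

```python
from enum import Enum, auto


class Lesion(Enum):
    """Classes of inactivating lesion named in the text (labels are illustrative)."""
    LARGE_DELETION = auto()
    SUBMICROSCOPIC_DELETION = auto()
    RESTRICTION_PATTERN_CHANGE = auto()
    RECURRENT_POINT_MUTATION = auto()
    DIVERSE_POINT_MUTATIONS = auto()


# Mapping of each lesion class to the screening approach described above.
SCREEN_FOR = {
    Lesion.LARGE_DELETION: "cytogenetic study",
    Lesion.SUBMICROSCOPIC_DELETION: "FISH or biochemical assay of the gene product",
    Lesion.RESTRICTION_PATTERN_CHANGE: "DNA hybridization with RFLP mapping",
    Lesion.RECURRENT_POINT_MUTATION: "targeted analysis of the recurrently mutated regions",
    Lesion.DIVERSE_POINT_MUTATIONS: "sequence analysis of the entire gene",
}


def choose_screen(lesion: Lesion) -> str:
    """Return the screening approach suggested for a given lesion class."""
    return SCREEN_FOR[lesion]


if __name__ == "__main__":
    for lesion in Lesion:
        print(f"{lesion.name:28s} -> {choose_screen(lesion)}")
```

The table makes the trade-off explicit: the less predictable the lesion, the more of the gene must be examined.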


B. Human Gene Therapy 

A wide variety of somatic cell gene therapy protocols are currently 
under development in the laboratory or in use in the clinic that are 
designed to treat single-gene diseases, a variety of cancers, or HIV 
infection (Kohn et al., 1989; Roth et al., 1994; Dropulic and Jeang,
1994). The goal of human somatic cell gene therapy is usually one of
the following: to repair or compensate for a defective gene, to enhance 
the immune response directed against a tumor or pathogen, to protect 
vulnerable cell populations against treatments such as chemotherapy, 
to generate a marked population of cells for tracing the origins of 
recurrent tumors, or to kill tumor cells directly (Lee and Klein, 1995). 

Transmission of the recombinant DNA to target cells is accomplished 
either by ex vivo modification of the target cells followed by reintroduction
into the patient or by restricted in vivo delivery (Lee and Klein,
1995). Ex vivo modification is currently more widely used than in vivo
therapy. For example, cells found in the bone marrow can be targeted
ex vivo because they are readily accessible, manipulatable in
vitro, and contain stem cells to perpetuate the altered genotype. When 
in vivo delivery is used, the transgene is targeted to the appropriate 
cells by localized exposure, by use of a tissue-specific promoter to drive 
transgene expression, or by use of a viral vector that targets specific 
cell types (Salmons and Gunzburg, 1993). For example, liver can be 
targeted by using herpesvirus-based vectors, CD4+ lymphocytes can 
be targeted with HIV-based vectors, muscle cells can be targeted with 
tissue-specific promoters or by direct DNA injection, and lung cells can be
targeted for therapy in cystic fibrosis and lung cancer with inhaled 
adenoviral vectors (Dunbar and Emmons, 1994; Efstathiou and Min- 
son, 1995; Miller and Boyce, 1995; Trapnell and Gorziglia, 1994). 

Delivery vehicles can be viral or nonviral (Jolly, 1994; Ledley, 1995). 
Although viral vectors deliver recombinant DNA to a much higher 
percentage of cells than do nonviral delivery methods, nonviral meth¬ 
ods are useful when a small dose of the therapeutic gene will suffice. 
These methods include in vivo liposome delivery, direct injection of 
naked DNA, and transfection of target cells ex vivo (Lee and Klein, 
1995). If ex vivo transfection is used, the expressing cell population can 
be selected, thus increasing the percentage of cells expressing the ther¬ 
apeutic gene. 

Modified retroviruses are frequently used as vectors in human gene 
therapy because they recombine with the genome (Jolly, 1994). Unfor¬ 
tunately, recombination with the genome produces the risk of inser- 
tional mutagenesis. This risk is sometimes addressed by inserting a 
“suicide” gene into the vector in addition to the therapeutic gene so that cells
adversely affected by random integration can be killed (Tiberghien, 
1995). Because the retroviruses on which most commonly used vectors 
are based cannot infect nondividing cells, these vectors are unsuitable 
for some uses. However, this feature has been exploited to deliver 
genes in vivo to brain tumors, where the tumor cells are the only 
dividing cells (Oldfield et al., 1993). Murine retroviral vectors are used
to reduce the risk of recombination with endogenous human se¬ 
quences. However, modified HIVs are under development for treat¬ 
ment of AIDS patients (Dropulic and Jeang, 1994). 

To prevent the modified virus from spreading when viral vectors are 
used for delivery of the therapeutic gene, replication is impaired by 
removing essential genes from the viral genome (Levine and Fried¬ 
mann, 1991a). The viral vector is thus capable of only one round of 
infection. The modified viral genome containing the therapeutic gene is 
first transfected into helper cells in the laboratory to produce infectious 
particles. Helper cell lines have been created by stable transfection
with virus genomes lacking packaging signals but expressing viral
proteins necessary to package the modified viral genome into particles. 
Because recombination between the helper and vector sequences can 
regenerate replication-competent genomes, it is now common for all 
viral vectors to be designed to require multiple recombination events to 
acquire all the necessary genes for replication, thus minimizing the 
likelihood of generating a replication-competent virus. 

Adenovirus is also used as a vector, but it does not recombine with 
the genome and so the therapeutic gene can be lost upon cell division 
(Trapnell and Gorziglia, 1994). However, because adenovirus remains 
episomal, insertional mutagenesis cannot occur. Adenovirus can infect 
nondividing cells and is thus superior to retroviral vectors in some 
circumstances. Furthermore, loss of the episomal gene is not a concern 
in nondividing cells. Inhaled adenovirus vectors are currently used to 
treat cystic fibrosis. Adeno-associated virus, which can recombine with 
the genome and infect nondividing cells, also shows promise as a vector
(Kotin, 1994). Herpesvirus and vaccinia virus also remain episomal
and have been used in some protocols to target specific tissue, such as 
liver and brain (Whitman et al., 1994; Efstathiou and Minson, 1995). 

Treating a disease caused by a defect in a single gene requires clon¬ 
ing the gene and understanding the function of its product. Depend¬ 
ing on the type of disease being treated, three strategies for gene thera¬ 
py can be adopted. Dysfunctional genes can be replaced, altered to 
become functional, or augmented with healthy genes working at differ¬ 
ent loci within the cell or within different cells (Lee and Klein, 1995). 
Of these options, gene augmentation therapy is currently the most 
plausible. There are several protocols currently in use to treat single¬ 
gene diseases, including cystic fibrosis, Gaucher’s syndrome, severe 
combined immunodeficiency resulting from loss of adenosine deam¬ 
inase (SCID-ADA), familial hypercholesterolemia, and hemophilia 
(Colledge, 1994; Xu et al., 1994; Blaese et al., 1993; Grossman et al.,
1994; Yao et al., 1991). Protocols for treating Duchenne type muscular
dystrophy (DMD) as well as other muscular dystrophies are on the 
horizon (Morgan, 1994). 

There are no protocols in use to treat such diseases as diabetes, 
phenylketonuria (PKU), sickle cell anemia, or β-thalassemia even
though the genes that cause these defects are known (Anderson, 1994). 
This is due in part to technical problems associated with designing 
some of the required vectors and in part to the diminishing number of 
individuals that are affected with these diseases. However, it is inevi¬ 
table that as gene therapy becomes more routine these so-called or¬ 
phan diseases will be targeted for treatment. 




1. Gene Therapy in Oncology 

The largest number of gene therapy protocols currently underway is 
in the field of oncology (Roth et al., 1994; Culver and Blaese, 1994). 
Gene therapy approaches to cancer treatment seek to kill tumor cells 
directly, mark cells for tracing the origins of recurrent disease, en¬ 
hance immunity to cancer cells, or protect vulnerable cells from chemo¬ 
therapy (see Table 4). 

One of the most ingenious methods used to kill tumor cells directly 
with gene therapy uses a murine retroviral vector to deliver the HSV-TK
(herpes simplex virus thymidine kinase) suicide gene directly into brain
tumor cells (Oldfield et al., 1993). Because
the cells surrounding the tumor are not dividing, the virus infects the 
dividing tumor cells selectively. Once the suicide gene is inserted, the 
tumor cells are killed by administering the drug ganciclovir. Ganciclovir
is a nucleoside analog that is processed efficiently only by the viral TK
and not by other enzymes involved in nucleotide biosynthesis in normal human
cells. TK action on ganciclovir produces toxic metabolites that kill the
cell. Interestingly, neighboring cells that are not expressing TK are 
also killed, presumably by the movement of toxic metabolites through 
apoptotic vesicles, tight junctions, and local capillary beds. As a result 
of this bystander effect, not every cell within the tumor must receive 
the TK gene to be killed. Additionally, closely associated angiogenic 
cells within the tumor are killed as well, thereby shutting down the 
tumor’s blood supply and restricting the primary route of metastasis. 

Some gene therapy protocols are designed to achieve direct cell kill¬ 
ing by targeting the tumor cells with tumor-suppressor genes or with 
antisense oncogenes. For example, expression of the tumor-suppressor 
gene p53 in p53-deficient non-small-cell lung carcinoma (NSCLC) cells
has been shown to control their growth (Cai et al., 1993). Similarly, an
antisense K-ras gene has been shown to control the growth of
K-ras-expressing NSCLC cells (Zhang et al., 1993). Both of these observations
have helped pave the way for the development of gene therapy proto¬ 
cols designed to impose tumor cell death or growth arrest. 

The marking of cells to trace the source of recurrent disease exemplifies
another useful application of gene therapy in oncology. For instance,
patients with acute myeloid leukemia (AML) or chronic myeloid leukemia (CML) who
have undergone irradiation to kill tumor cells require bone marrow 
transplants to survive (Brenner et al., 1993). Disease-free autologous 
bone marrow can be used for this purpose, but in the event of relapse, 
the donor cells are suspect. Marking these cells with a gene such as the 
neomycin resistance gene allows the researcher to determine whether 
the donor cells were the source of the recurrent tumor. 

TABLE 4

Oncology Gene Therapy Protocols

Purpose              Tumor               Target Cell    Gene
-------------------  ------------------  -------------  --------------------------
Marking              AML                 WBM            Neo
                     CML                 WBM            Neo
                     ALL                 WBM            Neo
                     Neuroblastoma       WBM            Neo
                     AML                 CD34+          Neo
                     CML                 CD34+          Neo
                     CLL                 CD34+          Neo
                     Multiple myeloma    CD34+          Neo
                     Breast cancer       CD34+          Neo
                     Hodgkin’s           CD34+          Neo
                     Lymphoma            CD34+          Neo
                     Leukemia            CD34+          Neo
                     Melanoma            TIL            Neo
                     Ovarian cancer      TIL            Neo
                     Renal carcinoma     CD34+          Neo
Direct Cell Killing  Ovarian cancer      Tumor cells    HSVtk
                     Brain tumors        Tumor cells    HSVtk
                     Brain tumors        Tumor cells    Antisense IGF-1
                     NSCLC               Tumor cells    Antisense K-ras
                     NSCLC               Tumor cells    Wild-type p53
Enhanced Immunity    Neuroblastoma       Tumor cells    IL-2
                     SCLC                Tumor cells    IL-2
                     Colorectal          Tumor cells    IL-2; TNFα; HLA-B7; β2mg
                     Solid tumors        Tumor cells

Several applications of gene therapy are aimed at enhancing immunity
directed at tumor cells. All of these approaches are based on the
observation that increasing the immune response directed at a subset
of the tumor cells increases the overall response to the entire tumor
(Deisseroth et al., 1993). In one protocol, the HLA gene is inserted into
HLA(-) tumor cells to enhance tumor antigenicity (Nabel et al., 1992).
In another protocol, a cytokine gene is inserted into the tumor cells to act
as a sort of tumor vaccination (Tepper and Mule, 1994). In a variation
of this approach, genes for cytokines such as tumor necrosis factor
(TNF), IL-2, and IFN-γ are inserted into tumor-infiltrating lymphocytes
(TILs) obtained from patients with advanced melanoma and
transfused back into those patients (Hwu and Rosenberg, 1994). Some 
advanced melanomas have responded successfully to these treatments. 
The efficacy of the TILs bearing the various modifications is currently 
being assessed. Gene marking experiments with TILs indicate that 
modified, reintroduced TILs can persist for up to two months, that 
tumors are targeted by TILs, and that modified TILs are safe. 

The prospect of protecting vulnerable cells against chemotherapeu¬ 
tic agents is another exciting application of gene therapy. In some 
breast cancer gene therapy protocols, bone marrow is transduced ex 
vivo with retroviral vectors carrying copies of the multidrug resistance 
(MDR) gene (O’Shaughnessy et al., 1994). MDR functions to pump a
wide variety of drugs out of cells; its amplification in some tumors 
results in the failure of chemotherapeutic approaches to treatment. 
Borrowing from this natural example, reintroduced bone marrow cells
that have been modified to express several copies of MDR survive the 
high doses of chemotherapy needed to treat some aggressive breast 
cancers, thus increasing the chemotherapeutic dosage that the patient 
can withstand. 


C. Implications for Veterinary Medicine 

As alluded to in previous sections of this article, much of the current 
understanding of oncogene function has been augmented by the use of 
animal models. Many cancer-causing viruses, including retroviruses of 
chickens, cats, cattle, nonhuman primates, and mice (i.e., avian leuko¬ 
sis virus, Rous sarcoma virus, feline and bovine leukemia viruses, 
Mason Pfizer monkey virus, and murine mammary tumor virus, 
among others) and DNA viruses such as bovine papilloma virus, Marek’s
disease virus, rabbit fibroma virus, and a host of others, have been studied
in animals (Fenner et al., 1987; Mohanty et al., 1981). This has led
to better understanding of oncogene function and pathogenesis and in 
some cases has aided development of preventative measures and vac¬ 
cine therapy. To date, few studies have focused on oncogene or tumor- 
suppressor activation or deactivation in spontaneous or familial tumors
arising in animals, though clearly the same mechanisms that
have been defined in human disease may be responsible for oncogenesis
in animals. It is hoped that future research will concentrate on
these aspects of neoplasia in order to provide better diagnostic, prog¬ 
nostic, and therapeutic capabilities to veterinary clinicians. 


VII. Conclusion 

During the past decade, great insights have been gained into the
mechanisms underlying the pathogenesis of cancer. Such strides
have been aided by study of DNA-mediated gene transfer, oncogenic 
DNA and RNA viruses in animals, and evaluation of genes and chro¬ 
mosomes in families with apparently heritable forms of cancer. New 
molecular biology methodologies, including gene mapping and cloning, 
as well as the ability to induce mutations in mice through the use of 
transgenic and knock-out technology, have contributed significantly to 
these studies. Through this work, four categories of genes have been 
identified that are involved in oncogenesis: (1) oncogenes, which nor¬ 
mally function to promote cellular growth and division in a pro¬ 
grammed manner, but which can trigger unregulated proliferation 
when some aspect of their function or expression is altered; (2) tumor- 
suppressor genes, which normally act to suppress cellular growth and 
which can lead to uncontrolled proliferation when inactivated; (3) 
genes encoding DNA repair enzymes, which when dysfunctional lead 
to increased accumulation of mutations in cancer-predisposing genes; 
and (4) metastasis-associated genes, which when activated allow can¬ 
cer cells to disseminate and establish tumors in sites peripheral to the 
original neoplasm. Establishment of the genetic basis of oncogenesis 
has provided an opportunity for new, highly accurate diagnostic meth¬ 
odologies to be developed and hopefully will result in improved meth¬ 
ods of prevention and treatment for neoplastic disease. In addition, 
these discoveries, made possible by the use of animal models, may also 
ultimately be used to improve understanding and treatment of neoplasms
commonly encountered in companion and large animal veterinary
medicine.


I. Introduction

Developments in molecular genetics are enlarging the scope of ap¬ 
proaches to questions involving inherited traits or diseases in dogs. 
Current efforts are limited only by the amount of basic genetic knowl¬ 
edge about the canine genome, which is growing steadily (see “The 
Canine Genome”), and specific molecular data about individual Mendelian
traits in dogs. These are only relative limitations, however, because
of the increasing interest in and possibilities for exploiting parallel
developments in human and murine genetics. We will consider several
examples that show the value of using these genetic data from
other mammals for comparative studies in dogs. 

Molecular genetics is based on detecting locus-specific differences 
between individuals. The loci studied may be single genes of known 
identity, which necessarily occupy specific chromosome locations, or 
unique but anonymous chromosomal position markers identified only 
by their deoxyribonucleic acid (DNA) sequences. The latter are more 
powerful analytically because they frequently show greater interin¬ 
dividual variations; they are the basis for the best current human 
(Gyapay et al., 1994) and murine (Dietrich et al., 1994) linkage maps.
Unfortunately, these anonymous markers cannot generally be related 
directly to specific biologic function(s). By contrast, the former are 
directly related to—and frequently derived from—specific genes and 
can be used to identify the precise mutation(s) responsible for specific 
traits. Ideally, and in the foreseeable future, both sorts of loci will be
interrelated in the canine genetic map as has been the case for maps of 
humans and mice. Currently, however, most Mendelian conditions in 
dogs are delineated on the basis of detailed study of their individual 
responsible genes. 


II. Methods 

A. Restriction Fragment Length Polymorphisms 

DNA-based gene analysis can proceed by several established meth¬ 
ods. The first method relies on producing site-specific cleavage of chro¬ 
mosomal DNA with restriction enzymes and separating the thousands 
of resulting fragments by gel electrophoresis. The array of fragments 
can then be transferred to a supporting filter membrane, becoming a 
so-called Southern blot. Such a blot then can be used to determine the 
size (length) of immobilized DNA fragment(s) that contain(s) se¬ 
quences identical (or sufficiently similar, depending on test conditions) 
to a specific DNA test sequence or probe. Exposing the blot to the probe 
permits alignment of the complementary nucleotide sequences and the 
size(s) (length) of the blotted fragment(s) can be determined by follow¬ 
ing the label (radioactivity, enzyme activity, antigenicity, etc.) on the 
probe. 

The critical variable in such a study is the size of the hybridizing 
fragment(s) on the Southern blot. Variation(s) in fragment size (length) 
between individuals is (are) the basis for identifying a restriction fragment
length polymorphism (RFLP). RFLPs have served as the basis
for much gene discovery and mutation detection. Each RFLP, however, 
relies on an imprecise measurement (the specific length of a DNA 
fragment, which may comprise thousands of base pairs) and may fail to 
detect many different DNA base changes (if the change does not dis¬ 
rupt the restriction enzyme recognition sites, which determine the end 
of the DNA fragment recognized by the probe, it will be invisible by 
this technique). Furthermore, an important DNA change, such as a 
point mutation, may not grossly alter the length of the fragment. An 
additional technical difficulty is that extensive RFLP analysis requires 
a substantial amount of DNA. 
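
The limitation just described can be made concrete with a short simulation. The Python sketch below is a minimal illustration under stated assumptions: the 77-bp "locus" is a made-up toy sequence, EcoRI (recognition site GAATTC) is used simply as a familiar enzyme, and all fragments are listed, whereas a real Southern blot would show only the fragment(s) that hybridize to the probe.

```python
import re

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence; the enzyme cuts after the first G


def digest(seq: str, site: str = ECORI_SITE, cut_offset: int = 1) -> list[int]:
    """Return fragment lengths from complete digestion of a linear sequence."""
    cuts = [m.start() + cut_offset for m in re.finditer(site, seq)]
    edges = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(edges, edges[1:])]


def substitute(seq: str, position: int, base: str) -> str:
    """Return a copy of seq with a single base substituted (a point mutation)."""
    return seq[:position] + base + seq[position + 1:]


# Hypothetical 77-bp toy locus containing two EcoRI sites.
REFERENCE = "A" * 20 + ECORI_SITE + "C" * 30 + ECORI_SITE + "T" * 15
SILENT = substitute(REFERENCE, 30, "G")     # change lies outside any site
SITE_LOSS = substitute(REFERENCE, 58, "G")  # GAATTC -> GAGTTC, site destroyed

if __name__ == "__main__":
    print(digest(REFERENCE))  # [21, 36, 20]
    print(digest(SILENT))     # [21, 36, 20]  (mutation invisible to RFLP analysis)
    print(digest(SITE_LOSS))  # [21, 56]      (two fragments fuse into one)
```

Only the substitution that destroys a recognition site changes the fragment pattern; the other point mutation leaves every fragment length unchanged and so would go undetected.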

B. Polymerase Chain Reaction 

The second approach to DNA-based gene analysis uses variations of 
the basic technique of copying any given DNA fragment by the poly¬ 
merase chain reaction (PCR). PCR can use a very small amount (in 
theory, a single molecule) of DNA (target DNA) and generate large 
numbers of copies by geometric amplification. PCR products can be 
tested directly for length by conventional gel electrophoresis. They also 
can be subjected to electrophoresis in denaturing environments, which 
cause the two DNA strands to separate and thus reveal nucleotide 
sequence changes that have not affected the length of the fragment but 
which may alter the mobility of the single strands. PCR products may 
be combined with automated DNA sequencing technology to determine 
the actual nucleotide sequence of a specific region of an individual’s 
genome and thus the presence of small changes can be detected. Varia¬ 
tions on the basic PCR technique can provide specific, rapid testing for 
mutations that have been identified previously. 
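
The geometric amplification mentioned above can be illustrated numerically. In this Python sketch the per-cycle efficiency parameter is our own assumption, introduced to show how incomplete doubling reduces yield; the text itself states only that amplification is geometric.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after PCR.

    efficiency is the fraction of templates copied in each cycle
    (1.0 means perfect doubling); it is an illustrative assumption.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # A single target molecule after 30 perfect cycles: 2**30 copies.
    print(f"{pcr_copies(1, 30):.3e}")  # 1.074e+09
    # The same reaction at 90% per-cycle efficiency yields noticeably fewer copies.
    print(f"{pcr_copies(1, 30, efficiency=0.9):.3e}")
```

Even a single starting molecule therefore yields on the order of a billion copies after 30 ideal cycles, which is why PCR needs so little starting DNA compared with RFLP analysis.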

PCR can be used to study both anonymous DNA sequences and those 
for which the function has been defined. Examples of the former are 
simple DNA sequence repeats (e.g., (CA)n, where n can vary widely
[e.g., Sack and Talbot, 1991]). Such dinucleotide repeats are spread 
throughout mammalian genomes. Other repeats may have tri- or tetra- 
nucleotide units; the latter have been most useful (because they show 
considerable variability) in canine genome studies. These repeated 
sequences are without known function but have distinct chromo¬ 
somal positions defined by the unique stretches of DNA that flank 
them. Despite their anonymity, such repeat sequences are very valuable
for linkage studies. Several groups are developing and characterizing
these markers for the canine genome (Ostrander et al., 1993;
Ostrander et al., 1995; Holmes et al., 1993; Mellersh et al., 1994;
Fredholm and Winterø, 1995). Undoubtedly, these will form the basis
of the first canine molecular genetic linkage map. 
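
How a simple sequence repeat distinguishes two alleles can be sketched in a few lines of Python. The flanking sequences and repeat counts below are invented for illustration; in practice the alleles are distinguished by the length of the PCR product generated with primers placed in the unique flanking DNA.

```python
import re

# Hypothetical unique flanking sequences that define the marker's position.
LEFT_FLANK = "GGTTAGCTTG"
RIGHT_FLANK = "TTGGATCCGA"


def make_allele(repeat_count: int, unit: str = "CA") -> str:
    """Build a toy PCR product: flank + (unit)n + flank."""
    return LEFT_FLANK + unit * repeat_count + RIGHT_FLANK


def longest_repeat(seq: str, unit: str = "CA") -> int:
    """Number of units in the longest uninterrupted tandem run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


if __name__ == "__main__":
    allele_a = make_allele(12)
    allele_b = make_allele(17)
    print(longest_repeat(allele_a), len(allele_a))  # 12 44
    print(longest_repeat(allele_b), len(allele_b))  # 17 54
```

The two toy alleles differ by five repeat units, so their amplified products differ by 10 bp, a difference that can be resolved by gel electrophoresis.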

PCR also can be used to study single, defined regions of individual 
genes. Regional DNA changes can then be related directly to specific 
positions in their respective genes. This can permit detailed piece-by- 
piece analysis of large genes because PCR is most useful for relatively 
short lengths of DNA. PCR testing will remain the most flexible and 
far-reaching approach for genomic research for an extended time. 


C. Mapping 

To be maximally useful, individual DNA markers must be related to 
each other as well as to specific canine chromosomes. Dogs have 38 
pairs of autosomes and a pair of sex chromosomes. These have been 
difficult to study because many are quite small. However, a consensus 
karyotype is now available and each can be distinguished (Selden et al.,
1975).


1. Chromosome Assignment 

The first task in mapping is to assign individual markers to their 
respective chromosomes. This can be done by several methods (dis¬ 
cussed in more detail in “Mapping Animal Genomes” and “The Canine 
Genome”). Cell lines can be established carrying small numbers of 
individual canine chromosomes on a background of, for example, the 
entire mouse or hamster genome. Such heterokaryons collectively con¬ 
tain all canine chromosomes. Many such cell lines are then prepared 
and the specific canine chromosomes that each line contains are deter¬ 
mined. DNA from each cell line can be isolated and PCR testing can 
show which line(s) contains the specific canine gene or DNA marker. 
Referring the test results to the chromosome complement of each cell 
line can then show which chromosome contains which marker. Because 
of the large number of canine chromosomes, 75-100 hybrid cell lines 
will likely be needed for these studies. Although much work is involved 
in preparing and characterizing the hybrid cell lines, their DNA can 
serve as a permanent and standardized reference for all subsequent 
studies. 
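
The logic of assigning a marker by comparing PCR results with the chromosome content of each hybrid line can be sketched as a concordance calculation. The panel below is a made-up five-line example (real panels, as noted, would need many more lines), and the scoring rule is one simple possibility rather than a description of any published protocol.

```python
# Hypothetical hybrid panel: cell line -> canine chromosomes it retains.
PANEL = {
    "H1": {1, 5, 9},
    "H2": {5, 12},
    "H3": {1, 12},
    "H4": {9},
    "H5": {5, 9, 20},
}

# Hypothetical PCR results: True if the marker amplifies from that line's DNA.
MARKER_HITS = {"H1": True, "H2": True, "H3": False, "H4": False, "H5": True}


def assign_chromosome(panel, marker_hits):
    """Return (best-matching chromosomes, concordance fraction).

    A line is concordant for a chromosome when the marker is present
    exactly when the chromosome is present.
    """
    chromosomes = set().union(*panel.values())
    scores = {}
    for chrom in chromosomes:
        concordant = sum(
            (chrom in panel[line]) == marker_hits[line] for line in panel
        )
        scores[chrom] = concordant / len(panel)
    best = max(scores.values())
    return sorted(c for c, s in scores.items() if s == best), best


if __name__ == "__main__":
    print(assign_chromosome(PANEL, MARKER_HITS))  # ([5], 1.0)
```

In this toy panel only chromosome 5 is present in every marker-positive line and absent from every marker-negative line, so the marker is assigned there. A tie among several chromosomes would indicate that the panel needs more lines to resolve them.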


2. In Situ Hybridization 

Another approach to assigning genes and anonymous markers to 
specific chromosomes uses in situ hybridization. Here, the probe corre¬ 
sponding to the locus of interest and labeled with an identifying dye is 
hybridized to a metaphase array of chromosomes from either a heterokaryon
or a regular canine cell. The single site where the probe hybridizes
can then be determined by light microscopy.

In situ hybridization can identify the chromosome to which a probe 
corresponds (the chromosome must be distinguishable by other criteria 
as well). Also, it is frequently possible to determine the relative posi¬ 
tion of the probe on the chromosome—e.g., telomeric, centromere-prox¬ 
imal or -distal. In ideal circumstances, the relative positions of differ¬ 
ent probes can be determined simultaneously when each is labeled 
with a dye of a different color. 

A third approach to assigning markers to specific chromosomes (and 
relative positions on those chromosomes) uses linkage methods as de¬ 
scribed in the following section. 

These approaches are complementary and each builds on informa¬ 
tion supplied by the others. Canine genome mapping will begin by 
identifying a few genes or polymorphic markers on each chromosome; 
others will be added subsequently as positional relationships are de¬ 
termined. The initial goal (to be useful for subsequent efforts) is to 
construct a map with markers at intervals of 10 centimorgans (cM).
This likely will require about 300 markers, allowing for redundancy 
and clustering, and will be the basis for the linkage studies proposed in 
the next section. 


III. Linkage Analysis


A. Background 

Genomic linkage studies are based on analyzing the passage of spe¬ 
cific genes (or anonymous DNA markers) through families (or kin¬ 
dreds). Such studies ask whether any two identifiable traits (e.g., 
anonymous DNA markers, variations in known genes, or clinical fea¬ 
tures) are passed together in a kindred more frequently than would be 
expected on the basis of chance alone. The more frequently these traits 
are transmitted together through a kindred on the basis of simple 
pedigree analysis the more closely their responsible DNA locations will 
be on the actual genetic map. Efficient use of linkage analysis presup¬ 
poses a set of highly variable markers throughout the genome which 
can be used for testing; DNA-based markers have been discussed pre¬ 
viously. In theory, any trait or marker that has Mendelian segregation 
can be placed spatially on the growing linkage map for the organism. 
The limiting requirements are a sufficiently extensive kindred for 
analysis of the trait or marker and an adequate set of variable markers 
for comparison with the trait or marker of interest. With respect to the
kindred, a minimum of three generations is usually needed, although
this depends on the structure of the kindred. In terms of the genetic 
markers, it is important to note that many individual canine breeds 
already represent genetic isolates because of purposeful inbreeding, 
which may have reduced intrabreed genetic variation substantially 
(see Section IV,B). This may make it more difficult to find large num¬ 
bers of highly variable genetic markers for certain breeds. The growing 
number of polymorphic markers now being developed should reduce, 
but not necessarily eliminate, this problem. 
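
The question of whether two loci are inherited together "more frequently than would be expected on the basis of chance alone" is conventionally answered with a LOD score, a statistic not named in the text above and introduced here as an assumption. The Python sketch below handles only the simplest case, in which each meiosis can be scored unambiguously as recombinant or nonrecombinant; the counts are hypothetical.

```python
import math


def lod(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses.

    Compares the likelihood of the data at recombination fraction theta
    with the likelihood under free recombination (theta = 0.5).
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    nonrecombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + nonrecombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


def best_theta(recombinants: int, meioses: int) -> float:
    """Grid search (0.01 to 0.50) for the theta that maximizes the LOD score."""
    grid = [i / 100 for i in range(1, 51)]
    return max(grid, key=lambda t: lod(recombinants, meioses, t))


if __name__ == "__main__":
    # Hypothetical kindred: 2 recombinants observed among 20 informative meioses.
    print(round(lod(2, 20, 0.1), 2))
    print(best_theta(2, 20))
```

By convention a LOD score of 3 or more is taken as significant evidence of linkage, which is one reason extensive kindreds are needed: small families rarely supply enough informative meioses to reach that threshold.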


B. Positional Cloning 

Linkage analysis has been used effectively for human diseases that 
have Mendelian segregation patterns but for which the underlying 
causes are unknown. In human medicine, a particularly vivid example 
is provided by Huntington’s disease in which the early finding linking 
the trait to an anonymous, variable DNA marker preceded discovery of 
the actual gene by a decade (Gusella et al., 1983; The Huntington’s
Disease Collaborative Research Group, 1993). Isolating the responsible
gene beginning with a chromosomal location has been called positional
cloning.

An important use of linkage analysis is to establish the chromosomal 
location for a given genetic trait. Often, this can serve as the first step 
in identifying and isolating the responsible gene. Examples of using 
this approach in humans include Huntington’s disease, as previously 
noted, as well as neurofibromatosis and cystic fibrosis (The Hunt¬ 
ington’s Disease Collaborative Research Group, 1993; Wallace et al., 
1990; Riordan et al., 1989). Once a chromosomal location has been
identified, subsequent efforts are directed to examining genes known 
to be in that chromosomal region as possible candidate genes to ex¬ 
plain the disease. If no obvious candidate gene exists, recombinant 
DNA techniques (e.g., positional cloning) can be used to search for 
genes in the appropriate region and examine them for mutation(s) that 
might explain the disease. 

In linkage studies, an identifiable marker in or near the candidate gene
must move consistently (segregate) through the kindred with the trait of
interest. Finding that the segregation pattern of a DNA or gene marker
change diverges from that of the trait of interest excludes that change
from explaining the condition. Conversely, finding consistent
cosegregation between the DNA marker and the trait is supportive, but
not necessarily diagnostic, of a direct relationship between them. These
important aspects of using linkage analysis emphasize the value of large,
well-defined kindreds to permit gathering statistically significant
linkage segregation
data and to support the identification of unequivocal gene positions 
before extensive molecular searches are undertaken. 

C. Candidate Genes 

A different but related application of linkage analysis is its use in 
candidate gene studies. Here, one can begin with a reasonable hypoth¬ 
esis about the pathophysiologic basis of the trait of interest; for exam¬ 
ple, an apparent enzyme deficiency. Next, a group of DNA clones is 
isolated representing potentially responsible genes. These clones are 
then studied for changes or mutations in affected animals. If such 
changes are found, the segregation patterns of the trait and potentially 
responsible gene change(s) can be compared within the kindred.
Failure to find cosegregation excludes the candidate gene(s).
Finding cosegregation strengthens the hypothesis. Here it is essential 
that a kindred be available for confirmatory studies. Another value of 
studying a large kindred in this context is that it provides reasonable 
assurance that only a single trait is present. The presence of a clini¬ 
cally similar but genetically distinguishable trait (phenocopy) compli¬ 
cates the analysis and can lead to rejecting a legitimate mutation if it 
is not present in all of the overtly similar animals. 
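
The exclusion logic for a candidate gene can be written out as a short check. This Python sketch assumes, for simplicity, a fully penetrant trait in which every animal with the causal genotype is affected and no unaffected animal has it; the animal names and genotypes are hypothetical, and the `has_causal_genotype` field stands for whatever genotype the hypothesis predicts (for a recessive trait, homozygosity for the variant).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Animal:
    name: str
    affected: bool
    has_causal_genotype: bool  # genotype predicted to cause the trait


def discordant(kindred: list[Animal]) -> list[str]:
    """Names of animals whose phenotype disagrees with their genotype."""
    return [a.name for a in kindred if a.affected != a.has_causal_genotype]


def evaluate_candidate(kindred: list[Animal]) -> str:
    """Exclude the candidate on any discordance; otherwise report support only."""
    misses = discordant(kindred)
    if misses:
        return "excluded (discordant: " + ", ".join(misses) + ")"
    return "cosegregates; hypothesis strengthened, not proven"


# Hypothetical kindred in which genotype and phenotype agree throughout.
CONSISTENT = [
    Animal("sire", False, False),
    Animal("dam", False, False),
    Animal("pup1", True, True),
    Animal("pup2", False, False),
]

# Same kindred plus one affected animal lacking the genotype.
WITH_EXCEPTION = CONSISTENT + [Animal("pup3", True, False)]

if __name__ == "__main__":
    print(evaluate_candidate(CONSISTENT))
    print(evaluate_candidate(WITH_EXCEPTION))
```

The second case shows the hazard noted above: a single overtly similar animal with a different underlying cause is enough to make a strict rule reject a legitimate mutation, so exceptions deserve clinical review before a candidate is discarded.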

This approach to discovering genes responsible for specific traits 
depends on the availability of DNA clones of canine genes or their close 
counterparts from other mammals. Our recent study of the cross-reac¬ 
tivity of a group of single-copy human gene clones with canine DNA 
indicated that at least a third failed to give specific hybridization pat¬ 
terns, presumably because of significant divergence between human 
and canine DNA sequences (Sack et al., 1996). Thus, at least in some 
cases, clones for the actual canine genes may be needed to perform 
effective candidate gene studies in dogs. By contrast, some genes iso¬ 
lated from humans are very similar to their canine counterparts and 
cross-react readily. This cross-reactivity between the gene DNA sequences
was fundamental to studies of golden retriever muscular dystrophy
(Sharp et al., 1992), phosphofructokinase deficiency in English
springer spaniels (Giger et al., 1994), proteolipid protein in Welsh
springer spaniels (Duncan and Nadon, 1994), clotting factor IX in
Labrador retrievers, in which the gene was deleted (Brooks, 1994), and
in other dogs in which a point mutation was present (Evans et al.,
1989), and α5(IV) collagen in Samoyed X-linked hereditary nephritis
(Zheng et al., 1994).

D. Quantitative Trait Loci

Another application of linkage analysis relates to many of the more 
common diseases of dogs. Canine hip dysplasia is a prominent example 
of a condition of widespread interest that has both a recognized spec¬ 
trum of clinical severity (e.g., age of onset, progression) and also breed- 
specific features (Hedhammer et al., 1979; Willis, 1989; Smith et al., 
1990). An extensive DNA-based marker map of the canine genome can 
be used with extended kindreds to identify the chromosomal locations 
of genes responsible for at least some of the variation encountered 
clinically. Mathematical approaches are being developed to quantitate 
the contributions of individual genes to the final clinical picture. Genes 
or DNA regions contributing to such common traits are referred to as 
quantitative trait loci (QTL) and may (at least initially) be anony¬ 
mous—i.e., identified by chromosomal position alone. Here, in addition 
to a reasonably dense gene map of informative markers, it is essential 
to have extensive kindreds and a consistent set of criteria for measur¬ 
ing and defining the severity of the admittedly variable condition. 

QTL encompass some of the most common traits and disorders in 
dogs, including temperament, stature, and tumor susceptibility. Iden¬ 
tifying critical genes will be very important in counseling breeders and 
owners. At least in principle, there is no reason why using this ap¬ 
proach in dogs will not be as informative as, for example, QTL studies 
of hypertension in rats (Jacob et al., 1991). Despite the importance of 
this approach, however, it will be a labor-intensive undertaking, crit¬ 
ically dependent on the prerequisites noted in this section. 


IV. Single-Gene Disorders 

Molecular biologic approaches are most directly applicable to this 
category of traits. This also is the area in which progress is likely to be 
most rapid. The speed of developments in this area will reflect several 
factors: (1) There already are many well-defined Mendelian traits in 
dogs and at least some implicate malfunction of specific genes. (2) 
Breeding records can serve as starting points for kindred analysis and 
many study kindreds have been assembled. (3) Many DNA clones of 
single genes are being isolated in other organisms (particularly hu¬ 
mans and mice). At least some of these should provide excellent start¬ 
ing points for studying their canine counterparts. (4) The technology of 
gene analysis (outlined in Section II) is directly applicable to studies in 
dogs. 

A. Canine Mutations

Table 1 lists nine canine Mendelian disorders that already have been 
studied by molecular techniques and for which actual nucleotide 
change(s) have been identified. In most of these, the responsible gene 
was already known or strongly implicated on clinical or metabolic 
grounds (for instance, the dystrophin gene in golden retriever muscular
dystrophy [Sharp et al., 1992], phosphofructokinase deficiency in
English springer spaniels [Giger et al., 1994], pyruvate kinase deficiency
in Basenjis [Whitney et al., 1994], and α-L-iduronidase deficiency in
Plott hound mucopolysaccharidosis I [Menon et al., 1992]). These results
are immediately applicable to clinical studies, such as those for
presymptomatic testing of animals at risk and for carrier screening in 
breeding populations. 

Many conditions originally described in humans as single-gene disorders
actually represent different mutations within the responsible
gene (for example, β-glucocerebrosidase in Gaucher disease [Horowits
et al., 1993], neurofibromin in neurofibromatosis [Wallace et al., 1990],
and the cystic fibrosis transmembrane conductance regulator in cystic
fibrosis [The cystic fibrosis genotype-phenotype consortium, 1993]). This
situation, which may be more the rule than the exception in humans, makes
carrier screening and presymptomatic detection in outbred human
populations considerably more difficult because one cannot tell a priori
which mutation to anticipate.

B. Canine Genetic Homogeneity 

In studies of canine genes, by contrast, the long history of selective 
breeding has turned many breeds into “genetic isolates” and substan¬ 
tially homogenized their gene pool. This was shown by our recent stud¬ 
ies of RFLPs in a large colony of Brittany spaniels in which we exam¬ 
ined the autoradiographic signals developed from hybridizing 17 
single-copy human DNA sequences against conventional Southern 
blots of canine DNA (Sack et al., 1996). Ten of these 17 probe DNAs 
represented so-called anchor loci, which have been proposed for comparative
mammalian gene mapping and which can reasonably be expected
to be conserved between mammals (O’Brien et al., 1993). Despite
considerable variation in hybridization conditions, only 8 of the
17 human DNA clones gave clear cross-hybridization signals and only
5 of these 8 showed diallelic RFLPs in the kindred. No multiallelic
patterns were encountered using these probes. Unfortunately, diallelic 
RFLPs are of only limited value in kindred studies because they show 
relatively few variants. 



TABLE 1

Mutations in Single-gene Canine Disorders

Disorder                                   Breed                     Mutation                                                 Reference
X-linked severe combined immunodeficiency  Basset hound              4 bp^a deletion in IL-2Rγ                                Henthorn et al., 1994
                                           Cardigan Welsh corgi      1 bp insertion
Muscular dystrophy                         Golden retriever          A→G in dystrophin                                        Sharp et al., 1992
Shaking pup syndrome                       Welsh springer spaniel    Point mutation in proteolipid protein                    Duncan and Nadon, 1994
Phosphofructokinase deficiency             English springer spaniel  G→A at 2228 in phosphofructokinase                       Giger et al., 1994
Rod-cone dysplasia I                       Irish setter              G→A at 2420 in cGMP^b phosphodiesterase B                Ray et al., 1994
Hemophilia B                               Guelph cairn terrier      G→A at 1477 in Factor IX                                 Evans et al., 1989
                                           Labrador retriever        Factor IX deletion                                       Brooks, 1994
Pyruvate kinase deficiency                 Basenji                   Deletion of C433 in pyruvate kinase                      Whitney et al., 1994
Mucopolysaccharidosis I                    Plott hound               G→A in donor splice site of intron 1 of α-L-iduronidase  Menon et al., 1992
X-linked hereditary nephritis              Samoyed                   G→T in α5(IV) collagen chain                             Zheng et al., 1994

^a bp: base pair
^b cGMP:

These findings imply that we may expect considerable genetic homogeneity
within individual canine kindreds. Thus, at least within
breeds, it is likely that individuals affected with simple Mendelian 
disorders will have the same mutation because there will be a concen¬ 
tration of heterozygous carriers. Another consequence of this high fre¬ 
quency of heterozygotes is the likely appearance of individuals homo¬ 
zygous for recessive conditions, which would be rare in outbred 
populations. For example, the large kindred of Brittany spaniels as¬ 
sembled for studies of hereditary canine spinal muscular atrophy 
(HCSMA [Cork et al., 1979]) also has been found to segregate a defi¬ 
ciency of the third component of complement (Winkelstein et al., 1981) 
as well as a recessive form of cleft palate (Richtsmeier et al., 1994).
These three traits are unlinked in Brittany spaniels. Recognizing this 
inbreeding should simplify screening and counseling because fewer 
mutations will need to be assessed once the breed-specific mutation(s) 
has (have) been identified. Mutations causing the same clinical picture 
(phenocopies) may differ between different breeds, however. Recall 
that hemophilia B (clotting Factor IX deficiency) shows a point muta¬ 
tion in Guelph cairn terriers and a deletion in Labrador retrievers 
(Evans et al., 1989; Brooks, 1994). 
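The effect of inbreeding on the appearance of recessive homozygotes can be illustrated numerically. The sketch below uses the standard population-genetics expression q² + Fq(1 − q) for the frequency of recessive homozygotes, where q is the allele frequency and F is the inbreeding coefficient. The function name and the example values of q and F are assumptions chosen for illustration; they are not taken from the studies cited above.

```python
def affected_frequency(q, inbreeding=0.0):
    """Expected frequency of recessive homozygotes.

    q          -- frequency of the recessive allele (0..1)
    inbreeding -- inbreeding coefficient F (0 = random mating, 1 = fully inbred)

    Uses q**2 + F*q*(1 - q), which reduces to the Hardy-Weinberg value q**2
    when F is zero.
    """
    if not (0.0 <= q <= 1.0 and 0.0 <= inbreeding <= 1.0):
        raise ValueError("q and inbreeding must both lie between 0 and 1")
    return q * q + inbreeding * q * (1.0 - q)


# Illustrative values only: a rare allele in an outbred population
# compared with the same allele in a closely bred line.
outbred = affected_frequency(0.01)
inbred = affected_frequency(0.01, inbreeding=0.25)
print(f"outbred: {outbred:.6f}  inbred: {inbred:.6f}  ratio: {inbred / outbred:.1f}")
```

With these invented numbers, homozygotes are roughly 25 times more frequent in the inbred line. A breed in which the allele itself has become common (larger q) would show a further increase.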

C. Technical Considerations 
1. Analytic Methods 

As noted previously, there are no unique technical barriers to gene- 
based studies in dogs, especially when the studies are based on PCR. 
PCR permits the use of very small amounts of DNA, which can come 
from blood, skin, tail, hair bulb, and so on. Approaches such as conventional
RFLP analysis require large amounts of blood or biopsy material,
and there is as yet no reliable method for propagating canine lymphocytes
in culture without using noncanine viruses. (By contrast, preparing
lymphoblasts by transforming peripheral lymphocytes is a widely used 
method for studying human DNA beginning with only a small blood 
sample.) The analytical barriers are those of identifying mutations 
using the methods described earlier. Specific mutations are likely to be 
discovered by individual laboratories concentrating on intensive study 
of single illnesses or traits. Such laboratories generally will have col¬ 
lected sufficiently informative kindreds and affected individuals to per¬ 
mit detailed studies and also may have obtained pathophysiologic 
data, which can aid their search by implicating candidate genes. This 
has been a frequent pattern for similar studies of human and murine 
mutations. 

2. Data Management

As observations from various laboratories accumulate, it will be es¬ 
sential to organize the data and make them readily accessible to all 
workers. Of necessity, this will be computer based, preferably on-line, 
and should be able to exploit the data-handling methods of the human 
and murine genome projects. Having open access to all data will speed 
this work considerably. It also will minimize duplication of efforts, 
which is inefficient and wasteful of limited resources. The DOGMAP 
internet site is a prototype for this approach (Dolf et al., 1994). A
particularly valuable method for human gene studies has been the 
publication of sequence-tagged sites (STSs), which are pairs of unique 
PCR primer DNA sequences flanking a defined genetic region (gene, 
anonymous marker, etc. [Olson et al., 1989]). Making all canine STS
data available to all workers will permit quick dissemination of essen¬ 
tial genomic information, which can be used by any laboratory inter¬ 
ested in a particular gene region. Hudson et al. (1995) published a 
human gene map based on 15,086 STSs. DOGMAP includes sequence 
identification for unique, polymorphic canine markers (Dolf et al., 
1994). 
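To make the idea of a shared STS record concrete, the following sketch shows one way such entries might be represented and indexed. It is not the DOGMAP format, which the text does not describe: the `STS` class, its field names, and the placeholder primer sequences are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

VALID_BASES = set("ACGT")


@dataclass(frozen=True)
class STS:
    """A sequence-tagged site: a primer pair flanking a defined region.

    Field names are illustrative assumptions, not a published schema.
    """
    name: str
    forward_primer: str
    reverse_primer: str
    product_size_bp: int
    chromosome: Optional[str] = None  # may be unknown for anonymous markers

    def __post_init__(self):
        # Reject primers that are empty or contain non-ACGT characters.
        for primer in (self.forward_primer, self.reverse_primer):
            if not primer or set(primer.upper()) - VALID_BASES:
                raise ValueError(f"invalid primer sequence: {primer!r}")
        if self.product_size_bp <= 0:
            raise ValueError("product size must be positive")


def build_index(records):
    """Index STS records by name, refusing duplicate names."""
    index = {}
    for rec in records:
        if rec.name in index:
            raise ValueError(f"duplicate STS name: {rec.name}")
        index[rec.name] = rec
    return index


# Placeholder entry with made-up primers, for demonstration only.
index = build_index([
    STS("demo_marker_1", "ACGTACGTACGTACGTAC", "TGCATGCATGCATGCATG", 180),
])
print(index["demo_marker_1"].product_size_bp)
```

Keeping each record immutable and validated on creation is one simple way to ensure that any laboratory retrieving an entry receives a usable primer pair.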


V. Future Directions 

Canine molecular genetics is in its infancy. The field will grow rap¬ 
idly based on considerations noted in this article with special contribu¬ 
tions from parallel efforts such as the Human Genome Project. Impor¬ 
tant developments can be expected in the following: (1) delineating the 
molecular bases for single-gene diseases in dogs (this clearly will move 
most rapidly), (2) establishing a readily accessible public molecular 
genetic marker map for the canine genome (likely to be based on 
simple-sequence repeat polymorphisms spread across all canine chromosomes
[Ostrander et al., 1993, 1995; Holmes et al., 1993; Mellersh et
al., 1994]), and (3) dissecting complex canine traits based on contributions
from individual Mendelian loci discovered by quantitative linkage
analysis.

The canine genome project will be particularly valuable for dissecting 
neurologic and behavior traits. Dogs permit detailed neurologic exam¬ 
inations, exposing fine nuances of clinical, behavioral, and neuro¬ 
physiologic changes. HCSMA in Brittany spaniels provides an excellent 
example of the details that can be established (Cork et al., 1979) by 
thorough study. Many specific traits have been identified in other breeds 
and all are candidates for genetic and molecular dissection. Such stud¬ 
ies will identify genes important for neuronal function. Still other, more 
broadly defined features (docility, herding behavior, aggressiveness),
while undoubtedly polygenic, show sufficiently breed-specific features
to be approached as quantitative trait loci (see Section III,D), so at least
some individual genetic contributions to them will be identified.

A particularly important outgrowth of this work will be a central 
data collection and coordination service. This will prevent duplicated 
efforts (because resources are limited) and also can serve as a referral 
center for specific diagnostic tests performed by individual laborato¬ 
ries. The rarity of some single-gene canine traits may prevent molecu¬ 
lar-diagnostic studies from being financially self-supporting. Thus, 
proprietary, investigator-based diagnostic services may remain valu¬ 
able for some time. Moreover, having additional numbers of referrals 
for the diagnosis of single-gene canine disorders in specialized labora¬ 
tories will likely improve the breadth of experience and reliability in 
each laboratory and may identify additional mutations as has been the 
case in diagnostic laboratories for human genetic diseases (e.g., Wal¬ 
lace et al., 1990; Riordan et al., 1989; Horowits et al., 1993; The cystic 
fibrosis genotype-phenotype consortium, 1993; Lei et al., 1995). 

Combining the wide variation of canine breed characteristics and 
numerous specific genetic traits already recognized with molecular 
analytic tools will undoubtedly broaden the understanding of canine 
biology. 

I. Introduction

A. Clinical Manifestations 

Hemophilias A and B are X-linked recessive bleeding disorders 
caused by molecular defects in the genes for factors VIII and IX, re¬ 
spectively. The incidence of the disease is about 25 in 100,000 males, 
with hemophilia A accounting for about 80% of the cases. Although the 
two forms of hemophilia result from mutations in two different genes, 
they are indistinguishable clinically. The severity varies as a function 
of the activity level of the missing or defective factor. Patients with 
F.VIII or F.IX levels less than or equal to 1% of normal are severely 
affected, generally requiring clotting factor infusion as often as two to
four times a month as treatment for bleeding episodes. Spontaneous 
bleeding into the joint space is the most common manifestation of the 
disease and over a period of years can result in significant arthropathy. 
The most serious complications arise from bleeding into closed spaces, 
e.g., intracranial, retroperitoneal, and epidural or retropharyngeal 
bleeds; these can be fatal if not treated quickly. Patients with F.VIII or 
F.IX levels from 1% to 5% have more moderate disease with sponta¬ 
neous bleeds occurring less frequently. Those with levels from 5% to 
15% have a mild phenotype and generally experience bleeds only in the 
setting of trauma or surgery. Mild forms of the disease may not be 
diagnosed until adult life. It is the bleeding into the joints that causes 
the major morbidity of the disease and often leads to crippling chronic 
arthritis and arthropathy. Because hemophilia patients have normal 
platelet function, they do not manifest excessive hemorrhage from mi¬ 
nor cuts and abrasions. 
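The severity bands given above can be written as a small lookup. This is a sketch only: the function name is hypothetical, and because the text gives overlapping boundaries ("from 1% to 5%", "from 5% to 15%"), the code assumes each upper bound is inclusive. Levels above 15% are not classified in the text, so the function returns `None` for them.

```python
def classify_hemophilia(factor_percent):
    """Map a F.VIII or F.IX activity level (% of normal) to a severity band.

    Bands follow the text: <=1% severe, 1-5% moderate, 5-15% mild.
    Treating the upper bounds as inclusive is an assumption.
    """
    if factor_percent < 0:
        raise ValueError("activity level cannot be negative")
    if factor_percent <= 1:
        return "severe"
    if factor_percent <= 5:
        return "moderate"
    if factor_percent <= 15:
        return "mild"
    return None  # above the ranges described in the text


for level in (0.5, 3, 10, 50):
    print(level, classify_hemophilia(level))
```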

B. Structure, Biology, and Genetics 

The genes for both F.VIII and F.IX are located on the X chromosome
at positions q28 and q27, respectively. The gene for F.VIII, one of the 
largest genes known, is composed of 26 exons and spans 186 kilobases 
(kb) (Gitschier et al., 1984). The F.VIII messenger ribonucleic acid 
(mRNA) is about 9 kb. The complete sequence of the gene has been 
determined (Gitschier et al., 1984; Vehar et al., 1984) and numerous 
mutations, including gross and partial gene deletions, as well as point 
mutations, have been identified in hemophilia A patients (Tuddenham 
et al., 1991). The gene for F.IX is about 34-kb long and consists of eight 
exons and seven introns (Yoshitake et al., 1985). Before secretion of the 
mature protein, F.IX must undergo cleavage of its signal and propeptide
sequences, as well as γ-carboxylation of 11 glutamic acid residues
at the N-terminus of the protein. The mature protein is secreted by
hepatocytes as a single-chain polypeptide consisting of 415 amino acids.
Expression of functional F.IX from target cells in gene therapy
requires that these cells perform necessary posttranslational processing
functions efficiently.


II. Advantages of Gene Therapy as a Treatment for Hemophilia 

Current treatment for hemophilia consists of intravenous infusion of 
clotting-factor concentrates, either plasma derived or recombinant. Be¬ 
cause of the expense of clotting-factor concentrates (adults with severe 
disease typically spend $50,000-$100,000 per year on concentrates),
infusions are generally given only in response to bleeds, not prophylactically.
Thus, during the inevitable delay between the onset of a bleed
and the infusion of factor, tissue damage results. Gene therapy poten¬ 
tially could provide sustained synthesis of the clotting factor from ge¬ 
netically modified cells, so a constant level of circulating factor would 
result and bleeds could be prevented rather than treated. Gene thera¬ 
py also has the potential to provide a less expensive alternative to 
infusion of clotting-factor concentrates, as it should obviate the need 
for frequent treatments. A major problem with plasma-derived concen¬ 
trates has been the transmission of blood-borne pathogens, especially 
hepatitis viruses (Troisi et al., 1993) and the human immunodeficiency 
virus (HIV) (Goedert et al., 1989). Current concentrates, however, are 
essentially safe because efficient methods for heat-inactivation, anti¬ 
body-purification, and virucidal treatments have been developed 
(Schimpf et al., 1987, 1989). 


III. Advantages of Hemophilia as a Model for Gene Therapy 

Compared to other genetic and acquired disorders for which gene 
therapy has been proposed or contemplated, hemophilia has several 
features that make it an appealing model in which to attempt gene 
therapy. For example, the hemophilias do not require precise regula¬ 
tion of the levels of expression of factors VIII and IX. Levels as low as 
5% of normal would have a substantial therapeutic effect, and levels 
as high as 200% of normal have not been associated with ill effects in 
patients infused with clotting-factor concentrates. This is at least 
partly because the factors circulate in the plasma as inactive precur¬ 
sors that become activated only when the coagulation cascade is initi¬ 
ated. Another advantage of hemophilia as a model for gene therapy is 
that there is no requirement for tissue-specific expression. Although 
the clotting factors are normally synthesized in the liver, work by a 
number of investigators has established that biologically active F.IX 
can be synthesized in a variety of other tissues, including fibroblasts, 
endothelial cells, and myoblasts (Dai et al., 1992; Palmer et al., 1989; 
Yao and Kurachi, 1992). Similar findings have been reported for 
F.VIII (Eaton et al., 1987). Finally, large- and small-animal models 
exist that faithfully mirror the human disease. For both hemophilia A 
and B, there are naturally occurring dog models of the disease. Canine
hemophilia colonies are maintained at several centers in North
America; the canine disease is X-linked and is characterized by frequent
spontaneous bleeding episodes similar to those seen in humans.
The canine F.IX copy deoxyribonucleic acid (cDNA) has been
cloned (Evans et al., 1989a), but the canine F.VIII cDNA is not yet
available. As is the case for humans, the molecular defects leading to 
canine hemophilia appear to be heterogeneous. Evans et al. (1989b) 
have delineated the defect causing severe hemophilia B (F.IX activity
<1% normal) in the Chapel Hill colony; these animals exhibit a Gly → Glu
substitution at residue 379, which appears to result in an unstable
protein that is destroyed intracellularly (i.e., is not secreted).
Other variants are caused by gene deletions or other point mutations 
(Marjorie Brooks and James Catalfamo, New York State College of 
Veterinary Medicine, Cornell University, Ithaca, New York, un¬ 
published observations). Mutations causing canine hemophilia A have 
not yet been reported. A mouse model of hemophilia A was created by 
knock-out technology (Bi et al., 1995), and similar models are in progress
for hemophilia B (Darrel Stafford, UNC-Chapel Hill, personal
communication). The hemophilia A mice have <1% F.VIII levels by an 
activity assay. They do not appear to bleed spontaneously but exhibit 
severe (lethal) bleeding following routine tail vein sampling. The 
availability of both large- and small-animal models constitutes an un¬ 
usual (though not unique) resource for the study of a genetic disease; 
the animal models have been used extensively for trials of gene therapy
(vide infra).


IV. Choice of Vector 

Vectors used in gene therapy for hemophilia must transfer the gene 
of interest efficiently and safely and must either result in long-term 
expression or allow for repetitive administration. It should also be 
noted that there is a distinction between gene repair, in which the 
defective gene is replaced by a normal copy of the gene, and gene 
insertion, in which the defective copy remains in the target cell, but a 
normal copy is inserted elsewhere within the host genome or the cell 
nucleus. With current techniques, any method of gene repair therapy 
remains speculative because the efficiencies in vitro and in vivo are 
much too low to achieve any substantial level of gene correction. Thus, 
all current approaches to gene therapy are gene insertion techniques. 

Vectors currently under study for gene therapy can be divided into 
two large groups: viral and nonviral vectors. Among viral vectors, most 
work has been focused on retroviral, adenoviral, and adeno-associated 
virus (AAV) vectors. Most nonviral techniques are based on liposomes
that can encapsulate large DNA sequences, and on techniques in which
the DNA is bound to polylysine together with adenoviral particles and
cell receptor ligands.

Fig. 1. Structure of a retroviral vector expressing human factor IX. (A) Segments of
the retroviral genes, gag, pol, and env, have been excised and (B) replaced by the human
F.IX cDNA and a selectable marker neo under the control of the SV40 promoter.
Transcription of the gene of interest, in this case F.IX, is controlled by a promoter in
the long terminal repeat (LTR).


A. Retroviral Vectors 

Retroviral vectors have generated a great deal of interest because 
they are capable of transferring genetic material stably into a cell’s 
genome and subsequently expressing it in a manner that is generally 
not detrimental to the host cell. Most of the currently used retroviral 
vectors are derived from the Moloney murine leukemia virus 
(MoMLV). Because parts of their coding regions are replaced by the 
gene to be transferred (Fig. 1), they are replication defective and have 
to be grown in special cell lines (packaging cells) that supply the structural
and replicative proteins in trans. However, co-infection with helper
virus can lead to viral replication, which may cause serious side
effects such as in vivo tumorigenesis as has been observed in some 
animal models (Donahue et al., 1992). Thus, for any human application 
of the virus, extensive testing for helper virus contamination must be 
carried out. There is no minimum requirement for the size of the insert 
in a retroviral vector; the maximum size insert that can be accommo¬ 
dated is approximately 7.5 kb (Morgenstern and Land, 1991). 
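The size constraint can be expressed as a simple check. The 7.5 kb limit is the figure cited above, and the 1.4 kb and 2.7 kb F.IX cDNA lengths are those given later in the chapter. The function name, the `extra_elements_kb` allowance for promoters or selectable markers, and the 2.0 kb and 9.0 kb test values are assumptions made for illustration.

```python
RETROVIRAL_MAX_INSERT_KB = 7.5  # approximate limit cited from Morgenstern and Land (1991)


def fits_vector(insert_kb, extra_elements_kb=0.0, capacity_kb=RETROVIRAL_MAX_INSERT_KB):
    """Return True if the insert plus any additional elements fits the vector.

    extra_elements_kb is a hypothetical allowance for promoters, selectable
    markers, and similar sequences carried alongside the gene of interest.
    """
    if insert_kb <= 0 or extra_elements_kb < 0 or capacity_kb <= 0:
        raise ValueError("sizes must be positive (extra elements may be zero)")
    return insert_kb + extra_elements_kb <= capacity_kb


print(fits_vector(1.4))                         # F.IX coding cDNA alone
print(fits_vector(2.7, extra_elements_kb=2.0))  # with 3' UTR plus assumed extras
print(fits_vector(9.0))                         # a hypothetical 9 kb insert
```

A check of this kind is only a first filter; it says nothing about titer, expression level, or stability of the resulting vector.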

The majority of published gene transfer studies use recombinant 
retroviruses. These vectors are capable of stably transducing up to 
100% of the target cells in tissue-culture experiments. However, replication
of the target cell is necessary for efficient transduction and proviral
integration into the host cell genome to occur; this feature limits
the in vivo utility of the vectors in many instances. Another problem is 
the instability of the retroviral particles. This has made it difficult to 
purify or concentrate recombinant retroviruses to high titers. 

B. Adenoviral Vectors

A number of recombinant adenoviral vectors expressing different 
transgenes have been developed and shown to deliver genes with high 
efficiency to a broad range of target cells (reviewed in Kozarsky and
Wilson [1993]). These vectors are of special interest because they are
easily prepared in very high titers, are capable of transducing nondividing
cells in vitro and in vivo, and direct high levels of expression of
the transgene. Integration into the host cell genome does not appear to 
be an integral part of the adenoviral life cycle. Rather, the viral DNA 
enters the nucleus, where it remains episomal. Most current recombi¬ 
nant adenoviral vectors are derived from human adenovirus type 2 or 
5, which cause respiratory disease in humans but which have not been 
associated with any human malignancies (Strauss, 1984). They contain
deletions of parts of their genome (e.g., of the E1a-E1b and/or E3
genes) that render them replication deficient (Fig. 2). These
replication-deficient adenoviruses are grown to high titers in 293 cells, a
human embryonic kidney cell line, which is transformed with the adenoviral
E1 gene; this supplies E1 gene products in trans.

Fig. 2. Structure of an adenoviral vector expressing human factor IX. By convention,
the 36-kb adenoviral genome is divided into 100 map units. E denotes genes that are
expressed before viral DNA replication (early). The vector, created by an in vivo
recombination event, contains a F.IX cassette at the site of the E1 deletion.

The usefulness of recombinant adenoviral vectors for in vivo applications
has been limited by the immune response of the host organism to
the virus and virally transduced cells; the immune response results in 
both limited duration of expression and a failure of expression following
readministration of the vector (Yang et al., 1994). Strategies proposed
to circumvent this obstacle have included immunosuppression of
the host (Dai et al., 1995), engineering of the adenoviral vector to
reduce expression of adenoviral proteins (Engelhardt et al., 1994), and 
the induction of tolerance in the host animal (Walter et al., 1996). 

C. Adeno-Associated Viral Vectors

Although there are not yet any data on the use of AAV vectors for 
gene transfer of F.VIII or F.IX, these vectors are noteworthy for several 
features that make them potentially useful as gene delivery vehicles. 
AAV is a single-stranded DNA virus that encodes two major genes, rep
and cap, and includes inverted terminal repeat (ITR) sequences (Fig. 3).
The ITR sequences are required in cis to function as the origin of
replication, and they also contain the sequences required for packag¬ 
ing, integration into the host cell genome, and rescue from recombi¬ 
nant plasmids (Hermonat et al., 1984; Samulski et al., 1987). The rep 
and cap genes are only required in trans for viral replication and encapsidation,
respectively. In a recombinant vector, the promoter and
transgene cassette is placed between the ITRs and rep and cap are 
supplied in trans (on a different plasmid) during the generation of the 
recombinant virus. A helper virus, such as adenovirus, is needed for 
efficient AAV replication in vitro. Once generated, these vectors are 
replication defective because they lack rep and cap. Thus, recombinant 
AAV vectors have the advantage that coding sequences for viral gene 
products are not transferred to the host. After gaining entry into a host
cell, AAV can stably integrate its DNA via the ITRs and has been
shown to transduce efficiently a wide range of human target cells (Cukor
et al., 1984). In contrast to retroviruses and adenoviruses, AAV is
not associated with any human disease (Berns et al., 1982). Wild-type
AAV possesses the unique feature of integrating preferentially at a
specific location on chromosome 19 (Kotin et al., 1990). Such a feature
would be an advantage in a vector because it would decrease the likelihood
of integration occurring at a potentially harmful site. However,
conservation of this property would most likely require inclusion of
additional AAV gene sequences in the recombinant vector. Thus far,
low titers have been a problem. AAV vectors have not yet been used in
any human trials but have been used successfully for gene transfer in
rodents (Kaplitt et al., 1994) and rabbits (Flotte et al., 1993). A clearer
understanding of AAV biology, and technical advances in high-titer
vector preparation, will be required for more extensive use of this
vector.

Fig. 3. AAV vector. (A) The genome of the wild-type AAV is ~4.7 kb. The ITRs are
~145 bp long. (B) In the vector, the promoter and transgene cassette replaces the rep and
cap genes, which are supplied in trans by a second plasmid during generation of the
recombinant vector.


V. Factor IX 

Because the F.IX cDNA is only about 1.4-kb long (about 2.7 kb when 
the long 3' untranslated region is included), it fits the size constraints 
of all currently useful viral vectors. Much of the experience with in vivo 
gene transfer of F.IX (summarized in Table 1) has been with retroviral 
vectors. These vectors have been shown to express F.IX at high levels 
in cell culture (up to 3000 ng/ml) in various cell types. When trans¬ 
duced with vectors carrying either the human or canine F.IX cDNA, 
fully active F.IX was produced in primary culture human (Palmer et
al., 1989) and mouse (St. Louis and Verma, 1988) fibroblasts as well as
in primary skin fibroblasts of hemophilic dogs (Axelrod et al., 1990;
Lozier et al., 1994), primary rabbit hepatocytes (Armentano et al.,
1990), rat capillary endothelial cells (Yao et al., 1991), and mouse myo¬ 
blasts (Dai et al., 1992; Yao and Kurachi, 1992). This demonstrates 
that the necessary posttranslational modifications can be performed by 
a wide variety of target cells. However, successful gene therapy would 
require that cells transduced ex vivo also survive and express the trans¬ 
gene at therapeutic levels when transferred to a host animal. When 
Palmer et al. (1989) transduced human fibroblasts ex vivo and implant¬ 
ed these subcutaneously into nude mice, they achieved moderate levels 
(50-200 ng/ml) of active human F.IX, but levels declined to baseline 
after 4 weeks. Similar results were obtained by St. Louis and Verma 
(1988). Dai et al. (1992) reported long-term expression of biologically 



TABLE 1 

In Vivo Animal Trials of Gene Therapy for Hemophilia B 


Vector 

Target Cells 

Animal Model 

Maximal Levels 
of Active 

F.IX (%) 

Duration of 
Expression 

Reference 

Retrovirus 

Hapatocytes 

Hemophilic Dogs 

<0.2% 

>9 mos. 

Kay et al. (#35) 
Science '93 

Retrovirus 

Primary Myoblasts 

Nude Mice 

0.2% 

>6 mos. 

Dai et al. (#10) 

PNAS '92 

Adenovirus 

Hepatocytes 

Immunocompetent 

Mice 

-6% 

9 weeks 

Smith et al. (#46) 
Nature Genetics '93 

Adenovirus 

Hepatocytes 

Hemophilic Dogs 

300% 

4 ±8 weeks 

Kay et al. (#36) 

PNAS '94 

Adenovirus 

Hepatocytes 

Immunocompetent 

Mice 

-500% 

> 6 mos. 
with 2 
injections 

Walter et al. (#22) 
PNAS ’96 
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active F.IX from transduced primary myoblasts that were injected into 
hind legs of recipient mice. Although the expression was stable for over 
6 months, levels of F.IX expression were subtherapeutic at 10 ng/ml. In 
a different in vivo approach, Kay et al. (1993) infused a recombinant 
retrovirus expressing canine F.IX into the portal vein of hemophilic 
dogs that had undergone partial hepatectomy (to induce cell division in 
the remaining hepatocytes). The authors reported a modest shortening 
of the whole blood clotting time, but levels of F.IX expression (—10 
ng/ml) were still well below the therapeutic range. These results indi¬ 
cate that in vivo gene therapy with retroviral vectors is feasible, but 
also that levels of expression will have to be increased by at least an 
order of magnitude for possible human application. 
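
The text quotes F.IX levels in ng/ml, whereas Table 1 gives them as a percentage of normal. The sketch below shows how the two are related. It assumes a normal human plasma F.IX concentration of about 5 µg/ml, a commonly cited reference value that is not stated in this chapter. Canine and assay-specific reference values differ, so the percentages in Table 1 need not match this assumption. The function names are hypothetical.

```python
# Hypothetical helpers relating F.IX concentrations to percent of normal.
# ASSUMPTION: 5000 ng/ml (about 5 ug/ml) is used as the normal human plasma
# F.IX level; this reference value is not taken from the chapter itself.

ASSUMED_NORMAL_FIX_NG_PER_ML = 5000.0


def percent_of_normal(level_ng_per_ml, normal_ng_per_ml=ASSUMED_NORMAL_FIX_NG_PER_ML):
    """Return a plasma F.IX level as a percentage of the assumed normal level."""
    if normal_ng_per_ml <= 0:
        raise ValueError("normal_ng_per_ml must be positive")
    return 100.0 * level_ng_per_ml / normal_ng_per_ml


def fold_increase_needed(current_ng_per_ml, target_ng_per_ml):
    """Return how many times the current level must rise to reach the target."""
    if current_ng_per_ml <= 0:
        raise ValueError("current_ng_per_ml must be positive")
    return target_ng_per_ml / current_ng_per_ml


if __name__ == "__main__":
    # ng/ml values quoted in the text for retroviral gene transfer
    for level in (10, 50, 200):
        print(f"{level} ng/ml -> {percent_of_normal(level):.1f}% of assumed normal")
```

Under this assumption, 10 ng/ml corresponds to 0.2% of normal, and an order-of-magnitude increase would bring it to about 2%.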

Adenoviral vectors have been used in dogs with hemophilia B to
achieve high-level expression of canine F.IX with complete phenotypic
correction, but expression is only transient and, on readministration of
the vector, there is no expression, presumably because of the immune
response to the vector. Kay et al. (1994) injected an adenoviral vector
into the portal vein and achieved supraphysiologic levels (~33 µg/ml)
of canine F.IX in the plasma. Levels fell steadily over a period of 1-2
months and were well below the therapeutic range (<10 ng/ml) at that
point. Readministration of vector was not specifically addressed in this
report. In a subsequent paper, Dai et al. (1995) showed that
intramuscular administration of an adenoviral vector expressing canine
F.IX to nude mice resulted in long-term (>300 days), high-level
expression of the transgene, whereas injection into normal mice resulted
in only 7-10 days of expression. Normal mice could not be reinjected
because of neutralizing antibodies to adenovirus. Co-administration of
the immunosuppressive drug cyclosporin A resulted in longer-term
expression. Several strategies to circumvent the obstacles to
reinjection have been proposed and investigated. Walter et al. (1996)
have taken advantage of the immaturity of the newborn immune system to
achieve successful repeat administration of an adenoviral vector
expressing human F.IX. Normal adult mice injected intravenously with
the vector showed supraphysiologic levels of expression, which
declined to zero over a period of 12-16 weeks; these mice could not
successfully be reinjected. However, newborn mice injected on day one
of life also showed high levels of expression, which declined over a
period of 12-16 weeks, but when these mice were reinjected with the
same vector, high-level expression was again achieved. Whether third
or subsequent injections will be possible is not yet clear, but the data
suggest that early exposure may alter the immune response. Other
investigators have used immunomodulatory agents to achieve expression
following repeat administration (Yang et al., 1995) or to achieve
longer-term expression from a single administration of an adenoviral
vector (Kay et al., 1995). Neither of these strategies has yet been tried
with vectors expressing coagulation factors.


VI. Factor VIII 

Experience with F.VIII gene transfer is more limited than that with
F.IX because F.VIII has several disadvantages compared to F.IX.
First, the full-length F.VIII cDNA is too large (9.0 kb) to be
accommodated in many currently available vectors. However, this
disadvantage can be overcome by deleting the nonessential B domain,
which renders the F.VIII cDNA small enough for most vectors. Second,
F.VIII is a considerably larger molecule than F.IX, and the half-life of
human F.VIII in human plasma is only 10-12 h (only about 2.5 h without
von Willebrand factor [vWF]) (Tuddenham et al., 1982). In addition, it
has been shown by several groups (Dwarki et al., 1995; Hoeben et al.,
1992) that the half-life of human F.VIII in murine plasma is even
shorter (less than 1 h), making it even harder to achieve detectable
levels of human F.VIII in murine model systems. (The recent
availability of a knock-out mouse model of hemophilia A may solve this
experimental problem; Bi et al., 1995.) Finally, vWF is required to
stabilize F.VIII in plasma because it acts as a carrier protein that
protects F.VIII from proteolysis in vivo. Thus, the expressed F.VIII
must gain easy and rapid access to the circulation to bind to vWF.
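
The effect of plasma half-life on detectability can be illustrated with a simple first-order decay model. The sketch below is an illustration only. It assumes single-exponential clearance and, for the steady-state comparison, a constant secretion rate; neither assumption comes from the studies cited above. The half-lives used (12 h in human plasma, 1 h as an upper bound in murine plasma) are the figures quoted in the text, and the function names are hypothetical.

```python
import math


def fraction_remaining(hours, half_life_hours):
    """Fraction of protein left after `hours`, assuming first-order clearance."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return 0.5 ** (hours / half_life_hours)


def steady_state_level(secretion_rate, half_life_hours):
    """Steady-state plasma level for a constant secretion rate.

    secretion_rate is a hypothetical input in concentration units per hour.
    At steady state, input equals clearance, so level = rate / k,
    where k = ln(2) / half-life is the first-order rate constant.
    """
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    k = math.log(2) / half_life_hours
    return secretion_rate / k


if __name__ == "__main__":
    human_half_life = 12.0   # h, upper end of the 10-12 h range quoted in the text
    murine_half_life = 1.0   # h, the text states "less than 1 h"
    print(fraction_remaining(12, human_half_life))    # one half-life elapsed
    print(fraction_remaining(12, murine_half_life))   # twelve half-lives elapsed
    ratio = (steady_state_level(1.0, human_half_life)
             / steady_state_level(1.0, murine_half_life))
    print(ratio)  # ratio of modeled steady-state levels
```

In this model the steady-state level scales with the half-life, so at the same secretion rate a protein cleared with a half-life under 1 h would reach less than one-twelfth of the level expected with a 12 h half-life.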

Several groups have successfully used recombinant retroviruses for
gene transfer of F.VIII. In in vitro experiments, Hoeben et al. (1990)
achieved moderate levels (35-125 mU/10^6 cells over 24 hours) of
functional human F.VIII in murine fibroblast cell lines as well as in
primary human skin fibroblasts. These vectors contained a partially
deleted F.VIII cDNA (B domain deleted) driven by the retroviral long
terminal repeat (LTR). When the transduced cells were transplanted
subcutaneously into nude mice, no human F.VIII antigen was detected,
although the cells seemed to survive (Hoeben et al., 1992). It was
unclear whether the failure to detect F.VIII antigen after implantation
was caused by the short half-life of human F.VIII in murine plasma,
insufficient vascularization of the implant, inadequate transport of
F.VIII into the circulation, or specific inactivation of the virally derived
transcripts (Challita and Kohn, 1994).

Dwarki et al. (1995), using a retroviral vector to express a B
domain-deleted F.VIII construct, achieved therapeutic levels of active
F.VIII in nude mice. When this group transduced primary human
fibroblasts in vitro and implanted them into the peritoneal cavity of
nude mice, they detected levels of F.VIII as high as 100 ng/ml for about
10 days, after which the levels fell to zero. Both cell death and
extinction of the LTR-driven mRNA expression may have played a role in
the limited duration of expression. Interestingly, when transduced C2C12
cells expressing F.VIII were injected intramuscularly into nude mice,
no F.VIII was detectable. This is in contrast to the same experiments
done with F.IX, where F.IX can be detected in the plasma of nude mice
for prolonged periods (>100 days) at levels of 10 ng/ml (Dai et al.,
1992). More recently, Kaleko and coworkers have achieved sustained
levels of F.VIII in the plasma of C57BL/6 mice using an adenoviral
vector (Connelly et al., 1996). These investigators have shown that
inclusion of genomic elements in the adenoviral construct results in
higher-level expression, allowing for administration of lower doses of
vector. Duration of expression is longer with lower doses, presumably
because of a less vigorous immune response.

In a different, nonviral approach, Zatloukal et al. (1994) used
adenoviral-polylysine-transferrin-DNA complexes to express human
F.VIII in mice. In this gene delivery method, pioneered by Birnstiel
and colleagues (Cotten et al., 1992), polylysine serves to condense the
DNA, and transferrin serves as a ligand that allows the complex to
bind to receptors on the surface of the target cell, leading to
endocytosis of the DNA-containing complex. Inclusion of
(ultraviolet-inactivated) replication-defective adenovirus in the
complex results in 100- to 1000-fold enhancement of gene expression,
presumably because of disruption of the endosome and resulting improved
delivery of DNA to the target cell nucleus. In the report by Zatloukal
et al., murine fibroblasts and myoblasts were “transferrinfected” in
vitro and then injected into the spleens of mice. F.VIII levels detected
in the plasma were around 8-17% of normal at 24 hours after injection,
but declined to undetectable levels at 48 hours. The reasons for this
rapid decline in expression are not known. When the same cells were
injected into skeletal muscle, however, no expression was seen, which
is in accord with the report by Dwarki et al. (1995).


VII. Summary 

There are many lines of evidence that suggest the eventual success
of gene therapy as a treatment strategy for hemophilia. Because
current treatment protocols using plasma-derived or recombinant
proteins are far from ideal, the safe and efficient substitution of the
defective gene by a normal copy of the gene, or at least its addition,
would be of great benefit to the patient and may even be a potential
cure. However, the construction of efficient gene therapy vehicles has
proven quite difficult in the past and, so far, there is no system that
promises to have all the desired features without any serious
disadvantages. In general, transgene expression is either too low
(because of the low titers achieved during the generation of the virus)
or too short-lived (e.g., because of the specific shut-off of the
transferred promoter), as is often seen with retroviruses, or, in the
case of adenoviral vectors, limited by a strong immune response of the
host. Clearly, much work remains to be done to optimize these promising
though still imperfect vector systems. In the case of adenovirus, the
development of less immunogenic vectors or in vivo modulation of the
host immune system may hold promise for improvements. Reports by
Yang et al. (1995) and Kay et al. (1995) are promising steps in the
direction of immunomodulation. Both attenuate the immune reaction
to the adenoviral vector by simultaneous application of either an
interleukin or an immunoglobulin, respectively. When IL-2 was
administered, the amounts of IgA were reduced and successful
administration of a second dose of virus was possible. When CTLA4-Ig,
an immunoglobulin that blocks the second signal during antigen
presentation, was administered, a markedly prolonged expression of the
transgene resulted. In vivo trials with AAV vectors have been carried
out for some diseases (Flotte et al., 1993; Kaplitt et al., 1994) but not
for hemophilia. Advances in high-titer AAV vector preparation will make
this approach more feasible. The pace continues to quicken in the
development of nonviral modes of gene delivery (Perales et al., 1994).
Although these results are encouraging for the future of gene therapy as
a treatment for genetic diseases, much work remains to be done to make
this potential alternative a reality for treatment of hemophilia.
I. Introduction

Retroviruses under normal circumstances infect cells and yet inflict
relatively little damage on the infected cell. The cell, however, remains
infected for its lifespan, as do any daughter cells derived from it.
Retroviruses that belong to the lentivirus subgroup
are an exception because these viruses typically can directly kill the
infected cell. The types of diseases induced by lentiviruses are also
somewhat different, presumably at least in part because of additional
genes encoded in these complex retroviruses. Some retrovirus and
retrovirus-like sequences are present in germline cells and are referred
to as endogenous retroviruses. These sequences usually are nonexpressed
or are expressed at very low levels and are discussed in this article.

A wide range of biologic effects can result from retrovirus infections,
including malignancies, benign proliferative states that can
lead to malignancies, degenerative diseases, viremias, insertional
activation or inactivation of cellular genes, and transactivation of
cellular genes. These activities can result from normal replication of
these retroviruses or from the less frequent generation of
replication-defective viruses that can arise during persistent
infections. These replication-defective viruses are probably not
normally transmitted from host to host but have profound implications
in the host in which they arise.

“Genes Involved in Oncogenesis” reviews the role that retroviral
genes play in the development of malignancies, and the reader is
referred there for excellent coverage of this topic. This article
discusses those retroviral diseases and what is known concerning the
genes and controlling regions that have been shown to influence the
development of the broad category of nonmalignant diseases.


II. Retrovirus Structure and Replication 

The reader is referred to several comprehensive reviews of the
details of both retrovirus structure and replication (Coffin, 1996). This
review summarizes the available information concerning retrovirus
gene structure and replication of exogenous retroviruses. The
retrovirus family was previously subdivided into three main groups: the
oncoviruses, the lentiviruses, and the spumaviruses. However, the
International Committee on the Taxonomy of Viruses currently
recognizes seven genera of retroviruses (Coffin, 1992), with the likely
possibility of additional genera being added as a result of recently
described retroviruses of fish and insects. The seven genera include the
avian leukosis-sarcoma virus group, the mammalian C-type virus group,
the B-type virus group, the D-type virus group, the human T-cell
leukemia virus (HTLV)-bovine leukemia virus (BLV) group, the lentivirus
group, and the spumavirus group. Table 1 lists the retrovirus genera
recognized as of 1996, with examples of viruses as well as some of the
currently unclassified retroviruses. Table 1 also includes some biologic
features of some of these retroviruses along with the designation of
exogenous (transmitted horizontally) or endogenous (transmitted from
parent to offspring as a provirus integrated into the germ line).

The genomes of all retroviruses have some common features. Figure
1 demonstrates the basic features of the approximately 8 to 12 kilobase
(kb) retrovirus ribonucleic acid (RNA). Two copies of this RNA are
packaged per infectious virus particle. Features of this RNA include a
5' cap structure, a 3' polyadenylate tail, a short repeat (R) sequence at
both ends, a transfer RNA (tRNA) primer binding (PB) site, a splice
donor (D) site and at least one splice acceptor (A) site, an
encapsidation site required for packaging of viral RNA into
virus particles, and a U5 region, which contains unique information
and is the first region copied into DNA during reverse transcription.
The U5 region then becomes the 3' end of the long terminal repeat
(LTR). The U3 region becomes the 5' end of the LTR and contains a
number of cis-acting signals necessary for virus replication. The U3
region contains signals that can be recognized by the cellular
transcriptional machinery and is an important determinant of viral gene
expression and, therefore, is known to play an important role in many
retrovirus-induced disease processes. The four coding regions of a
generic retrovirus include the gag, pro, pol, and env genes. The gag
gene region encodes the group-specific antigens. The gag gene is
translated from the full-length viral RNA to yield a polyprotein that is
processed to yield three to five capsid proteins depending on the
retrovirus (minimally, the matrix, capsid, and nucleic acid binding
protein). The pro gene encodes a protease that processes the gag and
pol polyproteins. The pol gene encodes the reverse transcriptase (RT)
and an integrase (IN) protein. In addition, some retroviruses (equine
infectious anemia virus, feline immunodeficiency virus, Mason-Pfizer
monkey virus) have a deoxyuridine triphosphatase (dUTPase) encoded in
the pol gene region. The pro and pol genes are not strictly considered
separate genes because the pro region is encoded as a portion of
either the gag or the pol region, depending on the retrovirus, and the
pol region is encoded on the same messenger RNA (mRNA) as the gag
gene. The env gene encodes two envelope glycoproteins that are cleaved
from a precursor protein. The surface (SU) protein binds to cell-surface
receptors and mediates attachment of virus particles to cells, and the
transmembrane (TM) protein anchors the SU protein to the lipid envelope
of the virus particle. The env gene is translated from a singly spliced
mRNA, unlike the gag, pro, and pol gene products.

Fig. 1. Schematic diagram of the gene coding regions and important cis-acting signals
of retroviruses. These features are present on all retroviruses.


TABLE 1

List of Retrovirus Groups

Genus
  Virus                                           Properties

Avian leukosis-sarcoma virus group
  Rous sarcoma virus                              Exogenous, src oncogene
  Avian myeloblastosis virus                      Exogenous, myb oncogene
  Avian erythroblastosis virus                    Exogenous, erb-A and -B oncogene
  Avian myelocytomatosis virus                    Exogenous, myc oncogene
  Rous-associated virus                           Exogenous, B-lymphomas, osteopetrosis
  Rous-associated virus-0                         Endogenous, no known disease

Mammalian C-type virus group
  Moloney murine leukemia virus                   Exogenous, T-cell lymphomas
  Harvey murine sarcoma virus                     Exogenous, H-ras oncogene
  Abelson murine leukemia virus                   Exogenous, abl oncogene
  Feline leukemia virus                           Exogenous, T-cell lymphoma, immunodeficiency
  Simian sarcoma virus                            Exogenous, sis oncogene
  Reticuloendotheliosis virus;                    Exogenous viruses of birds
    spleen necrosis virus

B-type viruses
  Mouse mammary tumor virus                       Endogenous and exogenous, milk-borne,
                                                  mammary carcinoma, T lymphoma

D-type viruses
  Mason-Pfizer monkey virus                       Exogenous
  Simian AIDS viruses                             Immunodeficiencies in monkeys

HTLV-BLV group
  Human T-cell leukemia virus 1 and 2             T-cell lymphoma, neurologic disease
  Bovine leukemia virus                           B-cell lymphoma

Lentivirus group
  Human immunodeficiency virus 1 and 2            Acquired immunodeficiency syndrome
  Simian immunodeficiency virus                   Usually no disease in natural primate host and
                                                  AIDS-like disease in other primates
  Feline immunodeficiency virus                   Immune disorders, ocular abnormalities,
                                                  gingivitis
  Bovine immunodeficiency virus                   Exogenous, possibly benign
  Visna/maedi-ovine progressive pneumonia virus   Neurologic or lung disease in sheep
                                                  (occasionally goats)
  Caprine arthritis-encephalitis virus            Encephalitis in young, arthritis in adults
  Equine infectious anemia virus                  Periodic episodes of fever, anemia,
                                                  thrombocytopenia

Spumavirus group
  Simian foamy virus                              Exogenous, possibly benign
  Human foamy virus                               Exogenous, possibly benign
  Feline syncytium-forming virus                  Exogenous, possibly benign

Unclassified retroviruses
  Walleye dermal sarcoma virus                    Exogenous, benign dermal sarcomas of fish
  Damselfish neurofibromatosis virus              Exogenous, Schwann-cell neoplasms in
                                                  damselfish
  Russell's viper retrovirus                      Probably endogenous, probably benign
  Corn snake retrovirus                           Associated with neoplasms
  Retrovirus of molluscs                          Associated with sarcomas in clams

III. Additional Retrovirus Genes 

Retroviruses belonging to the murine leukemia-related and avian
leukemia-related virus groups require only the gag, pro, pol, and env
genes for all aspects of their replication. However, the other retrovirus
groups encode a variety of additional gene products (Fig. 2) that are
either necessary for replication or important for modifying the pattern
of viral (or perhaps cellular) gene expression. The mammalian B-type
viruses, like mouse mammary tumor virus (MMTV), encode a gene
designated the sag gene, for superantigen. The MMTV superantigen is
expressed on the surface of infected cells and activates a subpopulation
of CD4+ T cells. The sag gene may play an important role in ensuring
efficient transmission of MMTV via milk to neonates, possibly by
recruitment of susceptible T cells to mammary tissue (Golovkina et al.,
1992; Held et al., 1993).

HTLV and BLV encode two additional genes at their 3' ends from
doubly spliced mRNAs. The 3' coding region of these viruses was
originally designated the X-gene region. The tax gene is named for the
transactivator from the X-gene region. The tax gene product, a 40,000
dalton protein, is required for efficient promoter activity from the viral
LTRs and is therefore essential for replication. The rex gene is
designated as a regulator of RNA splicing from the X region. The rex gene
product is a 27,000 dalton protein that is required for the synthesis of
full-length and env mRNAs. When rex is not expressed, only tax
and rex mRNAs are produced and therefore no virus replication can
occur.

Fig. 2. Diagram displaying the coding regions of each of the seven representative
retrovirus groups. The open boxes indicate the coding regions of the gag, pro, pol, or env
gene regions of each respective virus. The solid boxes indicate either regulatory or
accessory genes that are encoded by these selected retroviruses. Note that the
complexity (number of regulatory and accessory genes) increases from top to bottom in
this listing.

The lentiviruses have a complex pattern of additional genes; in
the case of the human immunodeficiency virus (HIV), six had
been identified as of 1996. Some of the lentiviruses have predicted
open reading frames whose expression and function have yet to be
determined. It is also clear that not all lentiviruses have all of the
additional genes contained in the HIV genome.
For example, equine infectious anemia virus has only three additional
open reading frames besides the gag, pol, and env genes, designated tat,
rev, and S2. However, by alternate splicing, at least one additional
protein, ttm (truncated transmembrane), is encoded by combining a
portion of the tat and env genes.

The tat gene has a transactivating function that can significantly
stimulate transcription from the HIV LTR by increasing transcriptional
initiation or elongation (Peterlin et al., 1993). In HIV, tat requires a
cis-acting RNA element, designated the TAR element. The TAR element
forms a stem-loop structure located at the 5' end of viral RNA
transcripts. Genes with a tat or tat-like function have been identified
for most lentiviruses. Although tat is essential for efficient replication
of lentiviruses, the role of tat in the pathogenesis of lentivirus disease
is not clearly defined. A number of studies have demonstrated that HIV-1
tat has the capacity to activate promoters of some cellular genes and
heterologous viruses. Kaposi’s sarcoma-like lesions have been observed
in some mice made transgenic for tat gene expression (Vogel et al.,
1988), suggesting a possible link between this gene and the human
disease.

The rev gene performs a function analogous to that of the rex gene of
HTLV and BLV. The rev gene regulates splicing and transport of viral RNA
from the nucleus to the cytoplasm (Parslow, 1993). When the rev gene
product is present in infected cells, the production of full-genome-
length viral RNAs is greatly favored over subgenomic spliced viral
mRNAs.

The vif (viral infectivity factor) gene is present in HIV-1 and in most
of the other lentiviruses, including HIV-2, simian immunodeficiency
virus (SIV), and feline immunodeficiency and visna viruses. The vif
gene product has been shown to significantly enhance the infectivity of
HIV-1 virus particles. This gene has been shown to be essential for
establishing a productive HIV-1 infection in peripheral blood T
lymphocytes as well as monocytes and macrophages. Although it is not
currently known how the vif gene product performs its functions, it is
known that the protein is a cytoplasmic protein that tightly associates
with cell membranes (Goncalves et al., 1995). It has been speculated
that vif may counteract a cell-specific inhibitory factor(s) that prevents
HIV-1 particle assembly (Michaels et al., 1993).

The vpr gene product of HIV-1, HIV-2, and SIV is a virion-associated
protein that is believed to play a role in facilitating entry of the viral
core into the nucleus of nondividing cells. In addition, vpr prevents the
establishment of chronic HIV producer cells by arresting the infected
cells in the G2/M phase of the cell cycle. This cell-cycle arrest in G2
is brought about by a vpr-induced inhibition of the activation of the
p34cdc2/cyclin B complex (He et al., 1995). It is speculated that the
cell-cycle arrest might delay or prevent apoptosis of infected cells,
thereby allowing the infected cells to produce more virus (Hoch et al.,
1995). SIV mutants lacking either vpr or vpx still induce AIDS in rhesus
monkeys; however, a double vpr/vpx deletion mutant constructed on an
SIVmac239 background was highly attenuated (Gibbs et al., 1995). The
vpx gene is a homolog of vpr that is present in various SIV strains and
in HIV-2. The authors of these studies suggested that, because vpr and
vpx are related genes that may have overlapping or duplicative
functions, deletion of both of these virion-associated gene products may
completely eliminate their overlapping functions, resulting in less virus
growth in vivo and in virus attenuation.

The vpu gene of HIV-1 is associated with two known functions. The
first function is an induction of CD4 protein (the HIV cell receptor)
degradation in the endoplasmic reticulum. The second function, which
has been recently described, is a monovalent cation channel activity
when expressed in Xenopus oocytes (Ewart et al., 1996). The vpu protein
is an oligomeric, integral membrane protein that contains a hydrophobic
transmembrane domain and a polar phosphorylated cytoplasmic tail.

The nef gene of HIV-1, HIV-2, and SIV encodes a 27-34 kDa
myristylated, inner plasma membrane protein that is not required for
virus replication. Experimental infections of rhesus macaques with nef
gene mutants of SIV have shown that nef is required for the maintenance
of high viral loads and progression to AIDS. The nef gene has
also been shown to downregulate the expression of CD4 molecules on
the surface of cells by mediating the internalization and subsequent
degradation of the receptor in the lysosomes (Michael et al., 1995;
Kestler et al., 1991). The nef gene also has been shown to enhance the
infectivity of HIV-1 virions. This enhancement of infectivity appears
to be mediated at the level of altering the processing of the
internalized virus core, making it more competent for viral
deoxyribonucleic acid (DNA) synthesis (Aiken and Trono, 1995). There is
also evidence that nef is capable of both enhancement and inhibition of
T-cell activation. This may be a consequence of the release of the p56lck
tyrosine kinase from the CD4 cytoplasmic domain (Aiken et al., 1994;
Rhee and Marsh, 1994).


IV. Feline Acquired Immunodeficiency Syndrome

Feline leukemia virus (FeLV) infection can result in either neoplastic
or a variety of nonneoplastic diseases. The neoplastic diseases are
classified based on the cell type that has undergone malignant
transformation and the location of the primary lesion. The wide variety
of nonneoplastic diseases has made diagnosis of many of these diseases
very difficult, and they can be confused with other feline diseases.
The nonneoplastic diseases include nonregenerative anemia, hemolytic
anemia, thymic atrophy, a panleukopenia-like syndrome,
glomerulonephritis, reproductive disorders, and immunosuppression
(feline acquired immunodeficiency syndrome, FAIDS) (reviewed in Hardy,
1993).

A. FeLV Subgroups 

There are three subgroups of FeLV, which are designated A, B, and
C. These subgroups are based on the property of superinfection
interference, in which an A-type virus is blocked from infecting a
population of cells previously infected with another A-type virus.
In contrast, no interference is exhibited between an A- and B- or A-
and C-type virus. The A subgroup FeLV is the most common subgroup
isolated from domestic cats. This virus replicates to high titers in
infected cats and is always present when the B and C subgroup viruses
are isolated. The subgroup type determines the type of disease that can
be induced. For example, the Rickard strain of FeLV is a combination of
A and B virus types, which induces a high incidence of persistent
viremia and severe immunosuppression by 20-30 weeks postinfection. A
subgroup A infection alone (Glasgow strain of Rickard) results in a low
incidence of persistent viremia, hemorrhagic enteritis, and
neutropenia. The subgroup C virus is found in plasma after a delay and
may well be replication defective. The Kawakami-Theilen strain of FeLV
is a mixture of A, B, and C viruses and causes severe erythrosuppression
and rapid death (within 9 weeks) after inoculation of newborn kittens.
Surprisingly, this virus is nonpathogenic in kittens older than 6 months
of age and in adults. The Sarma strain is a subgroup C virus that induces
a viremia and erythroid aplasia in infected kittens.

In FeLV-infected cats, immunosuppression coincides with the onset
of bone marrow cell infection and precedes detectable neoplastic
transformation. During this time, infected cats are susceptible to a
variety of opportunistic infections, including enteritis, gingivitis,
bacteremias, pneumonia, feline infectious peritonitis, Hemobartonella
felis infections, and a number of parasitic infections. Kristal et al.
(1993) demonstrated that the FeLV surface protein gp70 determines the
subgroup interference type (A, B, or C), the T-cell killing properties of
the virus, the host range specificities, as well as the growth kinetic
properties. This group’s FeLV-FAIDS-EECC chimeric clone causes T-cell
killing, presumably because of efficient replication associated with
massive superinfection. Both the superinfection and the cell killing can
be inhibited by adding virus-neutralizing antiserum.


V. Murine AIDS 

Some inbred strains of mice, when infected with the LP-BM5 strain
of murine leukemia virus (MuLV), develop a severe immunodeficiency
that shares some similarities with human AIDS. Clinical features
include lymphadenopathy, profound deficiency of both humoral and
cellular immunity, splenomegaly, and hypergammaglobulinemia.
B-cell lymphomas can develop in the late stage
of this disease process. The LP-BM5 strain of MuLV is actually a
mixture of ecotropic, polytropic, and defective MuLVs. The defective
MuLV encodes an unusual 60-kDa protein related to gag that is probably
responsible for the immunosuppression. More recent data indicate
that the defective gag-related protein may have superantigen
properties that result in a selective expansion of certain T cells. These
properties may misdirect the immune response, leading to an inability to
mount an effective immune response against a potential pathogen
(Mosier et al., 1985; Klinken et al., 1988).


VI. Simian AIDS 

The Simian type-D retroviruses (SRV) are a group of exogenous vi¬ 
ruses that are indigenous to Asian macaques. There are also related 
type-D endogenous retroviruses found in simians. Only the exogenous 
viruses have been shown to be pathogenic. The exogenous type-D
viruses are nononcogenic, horizontally transmitted, and have the
capacity to cause a fatal immunodeficiency syndrome in macaques. SRVs are
not closely related to HIV or to SIV. SRV infections in primate centers 
have been shown to be important causes of fatal immunosuppressive 
diseases in Asian macaques. For example, in 1983-84 at the California
Primate Center, nearly all of the adult macaques in a breeding colony 
were infected with either SRV-1 or SRV-2 and the mortality was close 
to 50% in macaques less than 2 years of age. The prototype virus of this 
group, Mason-Pfizer monkey virus (MPMV), was isolated in 1970 from
a rhesus monkey mammary tumor. Experimental infection of infant 
rhesus monkeys with MPMV results in a wasting syndrome with ane¬ 
mia, thymic atrophy, neutropenia, and opportunistic infections (re¬ 
viewed in Gardner et al., 1994). 


VII. Murine Leukemia Virus 

A. Spleen Focus-Forming Virus 

There are some strains of MuLV, designated the Friend strains, that 
have the capacity to induce an abnormal proliferation of erythroblasts 
in infected mice. Although the nondefective MuLV possesses the capac¬ 
ity to induce this condition alone, it has been frequently observed that 
a replication-incompetent defective virus known as spleen focus-form¬ 
ing virus is a potent inducer of the erythroblastosis. Infected mice 
displaying the erythroblastosis condition are predisposed to developing 
a malignant erythroleukemia condition. It has been shown that a
modified MuLV envelope protein, p55, is responsible for the
erythroblastosis condition. Three modifications of the MuLV env gene are
required for the conversion of this retroviral protein into a protein
that can apparently function as a potent erythroid growth factor: (1) the
MuLV env gene recombines with an endogenous murine retrovirus env gene,
resulting in an altered sequence; (2) a portion of the surface domain of
the envelope gene and the transmembrane region is deleted; and (3) the
membrane-spanning domain of the transmembrane region is extended
(Ruscetti et al., 1990).

B. MuLV Neurologic Disease 

Wild mice captured from the Lake Casitas region in California have 
a high incidence of hind-limb paralysis associated with elevated levels 
of MuLV expression (Gardner et al., 1973). An isolate of this MuLV,
designated Cas-Br-E, causes the same hind-limb paralysis when
infecting inbred strains of mice. The neurologic disease seems to be a
consequence of high levels of virus replication in the central nervous system.
Mice older than 10 days of age that are inoculated with this virus do 
not develop the disease and do not become viremic. Mapping of this 
phenotype with nonneurovirulent strains of MuLV indicates that the
env gene is responsible for this property, although the LTR and some 
gag-pol sequences influence the rate of disease progression. The devel¬ 
opment of neurologic disease can take up to 12 months and the precise 
mechanism responsible for disease has not been defined. It has also 
been shown that a temperature-sensitive mutant of Moloney MuLV 
(MoMLV) and a rat-passaged variant of Friend MuLV have the capacity
to induce neurologic disease as well. The neurovirulence of the
temperature-sensitive mutant of MoMLV, like that of the Cas-Br-E strain,
has been mapped to the env gene. The envelope proteins accumulate in
the infected cell because of defective processing (Wong, 1983). 

C. Osteopetrosis 

Certain strains of avian leukosis virus (ALV) can induce
osteopetrosis in chickens. A proliferation of osteoblasts in the long bones of
the leg leads to a very apparent thickening of the legs. The osteoblasts 
contain large amounts of unintegrated viral DNA (Robinson and Miles, 
1985) and high amounts of virus. Although the mechanism of this 
condition is not clear, it is likely that the unusually high amount of 
virus replication in this tissue site results in a disruption of the deli¬ 
cate balance between bone growth and resorption that normally oc¬ 
curs. Genetic mapping studies comparing viral chimeras between os¬ 
teopetrosis-inducing and nonosteopetrosis-inducing ALV strains of 
virus indicate that sequences in the gag gene region near the 5' LTR 
determine the ability to induce osteopetrosis. 

D. HTLV-1-Associated Myelopathy 

In 1985, an association was noted between a neurologic disease in a
group of West Indian patients and infection with HTLV-1. The neurologic
disease was identified as tropical spastic paraparesis (TSP). A Japanese
group also noted patients with similar neurologic problems and
associated HTLV-1 infections (Osame et al., 1986; Akizuki et al., 1987).
This disease was designated HTLV-1-associated myelopathy (HAM). The
pathogenesis of this neurologic disease is poorly understood and is
under study (Furukawa et
al., 1992).

E. Reticuloendotheliosis Virus

Some strains of reticuloendotheliosis virus (REV) can cause a variety 
of nonneoplastic diseases in chickens and ducks. These diseases include
anemia (Kawamura et al., 1976), abnormal feather development
(Koyama et al., 1980), atrophy of the bursa and thymus (Mussman and 
Twiehaus, 1971), enteritis (McDougall et al., 1978), proventriculitis 
(Jackson et al., 1977), necrosis of the liver and spleen (Purchase and 
Witter, 1975), a runting disease syndrome (Mussman and Twiehaus, 
1971), and immunosuppression (Witter et al., 1979). In addition, REV 
can induce neoplastic conditions. The immunosuppressive properties 
of REV are thought to be responsible for secondary bacterial and viral 
infections. Cell-mediated immune responses are impaired because of 
infection with this virus. The genetic basis of these diseases is not 
currently known. 


VIII. Endogenous Retroviruses 

A. Mouse Mammary Tumor Virus 

The typical mouse contains approximately 1000 copies of endogenous
retroviral sequences. There are at least three known cases in which
genetic-based diseases have been caused by insertion of endogenous
murine retrovirus sequences in inbred strains of mice: the hairless
mutation (Stoye et al., 1988), a coat-color mutation (Jenkins et al.,
1981), and a lethal collagen defect (Harbers et al., 1984). In addition, a
mammalian B-type endogenous virus, MMTV, is a less common endogenous
murine virus that requires a virally encoded superantigen (sag) gene to
facilitate efficient transmission to neonates via the milk.
Superantigens bind to MHC proteins and T-cell receptor protein variable
regions and, as a consequence, they activate all T lymphocytes of a
particular T-cell receptor class. The T cells bearing the appropriate
receptor will proliferate for a period of time but eventually will be
clonally deleted from the T-cell population. MMTV superantigen from C3H
mice causes T cells of the V-beta-14 class to proliferate. Transgenic
mice from a background that does not normally express the C3H MMTV sag
gene will delete the V-beta-14 class of T lymphocytes when made to
express this
gene. The elimination of T cells responsive to MMTV superantigen 
renders the mouse resistant to infection by that virus. Therefore, the 
sag gene is likely required in vivo for maintenance and transmission of 
this virus.


IX. Lentiviruses


A. Human Immunodeficiency Virus 

Although it is not possible to directly test the influence of various 
HIV-1 and -2 genes on pathogenesis in humans (as compared to SIV), 
there have been some interesting studies done on humans infected 
with HIV-1 who have survived for 10 years or more and remain healthy 
with no signs of clinical progression to acquired immunodeficiency
syndrome (AIDS) and normal CD4 lymphocyte counts. In some cases, the
virus from these long-term nonprogressors has been analyzed and
found to contain some interesting and suggestive mutations. Deacon et
al. (1995) analyzed the virus from a cohort (unrelated individuals) of
transfusion recipients from a blood bank in Sydney, Australia. The
HIV-1-infected blood donor and seven recipients of blood from the
infected individual had been infected for 10.755 to 14 years without any
cohort member developing any AIDS- or HIV-related symptoms or re¬ 
ceiving any antiviral therapy. Proviral DNA was analyzed from seven 
individuals and it was discovered that all seven virus populations con¬ 
tain deletions of various sizes in the nef gene and the U3 LTR domain 
because there is some overlap between the nef gene and the U3 region. 
Similar results have been reported for another individual long-term 
nonprogressor infected with HIV-1 in which deletions in the same 
nef-LTR regions were observed (Kirchhoff et al., 1995). These studies
are consistent with the SIV studies that indicated that the SIV nef 
gene played an essential role in the induction of disease (as discussed 
previously). 


B. Equine Infectious Anemia Virus 

Equine infectious anemia virus (EIAV) causes a persistent, life-long
infection of horses characterized by periodic episodes of fever, throm¬ 
bocytopenia, and anemia, which if severe can result in death within 
the first two weeks of infection. A comparison of sequence variation 
between a highly virulent strain of EIAV and an attenuated strain of 
this virus reveals a large number of sequence differences that are 
predominantly in the U3 region of the LTRs and the surface protein 
sequence (gp90). It has been possible only since 1996 to obtain infec¬ 
tious molecular clones of this virus that promptly induce disease; it 
now should be feasible to determine the role of various viral genes and
cis-acting sequences in disease expression.


C. Feline Immunodeficiency Virus

The disease induced by feline immunodeficiency virus (FIV) appears 
in many ways to mimic the effects of the HIV-1- and -2-induced immu¬ 
nosuppressive disease seen in humans. Cats develop an acute infection 
syndrome with a mild fever and transient generalized lymphadenopathy.
The acute phase of infection is followed by an asymptomatic stage
when the CD4:CD8 cell ratio declines because of a progressive decrease 
in CD4 cells. The asymptomatic phase is followed by a phase in which a 
variety of clinical disorders occur that closely resemble those in hu¬ 
mans with AIDS. Clinical signs in cats include a variety of opportunis¬ 
tic infections, chronic gingivitis and stomatitis, chronic upper respira¬ 
tory infections, as well as superficial and systemic fungal infections 
(Pedersen et al., 1989). While the virus by itself can invert the ratio
of CD4 to CD8 lymphocytes in infected cats, this infection appears, most
importantly, to reduce the efficiency of a Th1 (T-cell-based) immune
response. For example, cats that were experimentally infected with the
NCSU strain of FIV and were subsequently challenged with a normally
nonlethal dose of the intracellular parasite Toxoplasma gondii
succumbed to this dual infection. An analysis of the immune response
during the T. gondii infection, as shown in Fig. 3, indicated that the
FIV-infected cats failed to generate an IL12-dependent Type-1 immune
response to T. gondii infection but rather responded with an ineffective
IL4-induced Type-2 response (Levy et al., personal communication).
Because resistance to T. gondii depends primarily on a Th1 immune
response in cats, FIV-infected cats were unable to respond to the
parasite infection with a good Th1 response. As a result, the FIV-infected
cats developed a severe generalized toxoplasmosis resulting in acute 
and often fatal interstitial pneumonia. In contrast, cats infected only 
with T. gondii developed a transient, mild clinical disease
characterized by anorexia, lethargy, and multifocal chorioretinitis.

Fig. 3. Autocrine and paracrine cytokine regulation of the innate and T-cell-mediated
immune response to T. gondii. The initial interaction of the macrophage (MP) with the
pathogen (TOX) causes synthesis of IFNγ and TNFα and their respective receptors (step
1). The macrophage-derived IL12 in collaboration with TNFα or IL1β (step 2) induces
synthesis of IFNγ by the NK cell, which becomes the major IFNγ producer (step 3).
NK-cell-derived IFNγ combined with TNFα causes increased microbicidal activation of the
intracellular macrophage functions (step 3) as well as an upregulation of IL12 necessary
to stimulate the Th1 immune response (step 4). Antigen-presenting cells (APCs)
stimulate naive T cells (Th0) to differentiate into Th1 (inflammatory-T) (step 5) or Th2
(helper-T) cells and, although the specific signals are not clear, the cytokines present at
the time of antigen presentation play an important role. Cytokines that promote a Th1
response include IL12, IFNγ, IFNα, and TNFα. Once activated, Th1 cells leave the lymph
nodes, enter the bloodstream, and specifically migrate to regions of inflammation as
directed by adhesion molecules where they perform their primary effector function of
activating macrophages through the production of IFNγ (step 6). IL10 is an important
regulator of this process because a macrophage stimulated to produce IL10 will inhibit
the synthesis of IL12 and therefore inhibit the Th1 response (step 7). In FIV-infected
cats and HIV-infected humans, IL10 produced from either macrophages, CD8+ T cells, or
Th2 cells downregulates IL12 production from macrophages. The decreased IL12
production greatly inhibits the development of the Th1 inflammatory T-cell response.
Figure key: negative signal (greatly enhanced in FIV-infected cats and HIV-infected
humans). Courtesy of Wayne B. Tompkins.


D. Visna, Maedi, Progressive Pneumonia Virus

The most common disease manifestation in ovine lentivirus-infected
sheep is dyspnea and a wasting disease characterized by severe loss of
body weight (Narayan and Cork, 1985). The disease was first described
in epidemic form in Iceland by Sigurdsson after the introduction of
infected sheep from Europe during the mid-1930s. The primary clinical
manifestations included dyspnea and wasting in infected sheep. A
neurologic complication (visna) of the virus infection was observed in the
Icelandic breeds of sheep during the height of the epidemic in the
1950s. The disease is characterized primarily by inflammation of
specific organs or tissues. Affected tissues are characterized by a marked
infiltration of mononuclear cells to such an extent that normal tissue
architecture is disrupted and tissue and organ function is impaired.
Affected tissues can include the lung, spleen, joints, lymph nodes, brain,
and spinal cord, where these infiltrates can be so extensive as to disrupt
normal function. Further study is required to determine the role of
viral gene expression in the development of this lentivirus infection.

E. Caprine Arthritis Encephalitis Virus 

The most common clinical manifestation associated with caprine ar¬ 
thritis encephalitis virus (CAEV) infection in goats is synovitis in adult 
dairy goats. The synovitis has a gradual onset and slowly progresses to 
a crippling arthritis (Narayan and Cork, 1985). Newborn goats in 
herds of CAEV-infected goats can develop a rapidly progressing neuro¬ 
logic disease leading to paralysis and death. As in ovine lentivirus 
infection, CAEV-infected goats will display severe inflammation in af¬ 
fected tissues. Additional study is required to determine the role of 
viral gene expression in the development of disease for this lentivirus.


I. Introduction

Advances in molecular genetic technology and a worldwide assault 
on mapping and sequencing the human genome have made genome 
research highly visible to the public and one of the exciting new fron¬ 
tiers in the life sciences. Mapping genes to discrete relative positions 
on animal chromosomes is not a new concept among biologists,
however. Sturtevant (1913) successfully followed alleles through meiotic
recombination in Drosophila to order genes on the X chromosome, thus
introducing our current generation of animal biologists to the
underlying principles of genomic research. Unfortunately, the 50 years
immediately following Sturtevant’s work produced only a scattering of genes
mapped in a variety of animal species with the notable exception of the 
laboratory mouse. The availability of large numbers of inbred strains 
and a collection of mutant stocks produced a linkage map in Mus 
musculus, which became the prototypic mammalian genetic map. It 
has subsequently been enhanced by the availability of biochemical and 
molecular markers and the widespread use of interspecific hybrid 
backcrosses. The genomic revolution is most apparent, however, in the 
human genome project. It should not be surprising that the agri¬ 
cultural sciences, long concerned with germ plasm improvement, are 
beginning to invest in building maps and localizing important traits to 
chromosomal regions. Perhaps it is surprising that it has taken so long 
for animal geneticists, when compared to plant breeders and medical 
geneticists, to enthusiastically support the construction of genomic 
maps in economically important animals. Nonetheless, the genomic 
revolution has finally extended its borders to the study of animals used 
in agriculture and this article presents an overview of the progress and 
problems of genome analysis in those domestic animals. 


II. Why Map Farm Animal Genomes 

A. To Study Chromosomal Evolution 

The organization of extant genomes and their relationship to each 
other provides valuable insight into the organization of ancestral ge¬ 
nomes and how chromosome structure may have evolved. Comparative 
gene mapping, which is nothing more than mapping homologous genes 
in multiple species, provides important information about chromosom¬ 
al evolution between distant species that we cannot retrieve even with 
today’s most advanced cytogenetic technologies. Consequently, a driv¬ 
ing force in animal genome mapping is to be able to answer questions 
regarding chromosomal rearrangements that accompany evolution 
and thus to advance the fundamental biological knowledge of the spe¬ 
cies of interest and animals in general. Comparative mapping has 
become one of the major tools for the study of chromosomal evolution in 
animals. The well-funded and well-organized international human ge¬ 
nome initiative is well on the way to providing the standard mam¬ 
malian genomic map to which all others will undoubtedly be compared. 
The map of the laboratory mouse continues to advance at a rapid pace
and will undoubtedly continue to be the first yardstick for evaluating
chromosomal conservation and genomic evolution. Other mammals, 
including the livestock species, have extremely important roles to play, 
however, in helping us understand the pathways of chromosomal re¬ 
arrangement that have accompanied mammalian evolution. It is im¬ 
portant to note in this context that all mammals have essentially the 
same amount of deoxyribonucleic acid (DNA) in their genomes and a 
similar complement of structural genes. Chromosome numbers, how¬ 
ever, differ markedly even within some mammalian orders, and band¬ 
ing-pattern divergence suggests that this homologous mammalian ge¬ 
nome has undergone various combinations of reciprocal chromosomal 
exchanges and internal rearrangements as taxa have diverged. It has 
already become clear through comparative mapping that some mam¬ 
mals, such as cattle and cats, have genomes that are more highly 
conserved relative to the human genome than is the most often com¬ 
pared mouse genome. These more highly conserved genomes may most 
nearly reflect the chromosomal arrangement of the ancestral mammal. 
At the minimum, they demonstrate that mammalian chromosome evo¬ 
lution cannot be fully represented by differences in the genomes of 
humans and mice. Continued comparative mapping including the ge¬ 
nomes of domestic animals will continue to make a valuable contribu¬ 
tion to understanding mammalian chromosomal evolution in a univer¬ 
sal context. 


B. To Permit Marker-Assisted Selection 

The practical goal of economic enhancement through marker-as¬ 
sisted selection (MAS) of desirable and marketable traits drives gene 
mapping in many laboratories and provides the motivation for several 
national programs that fund animal genomic research. Genetic mark¬ 
ers of advantageous alleles for economic trait loci (ETL), including 
quantitative trait loci (QTL), have the potential to enhance the rate 
and efficiency of genetic gain through selective breeding, a concept 
advanced well before the technical tools became available for its
implementation (Soller and Beckmann, 1982; Weller et al., 1990; Smith and
Simpson, 1986). The ideal marker for use in a selective breeding pro¬ 
gram would obviously be the gene actually responsible for the trait. 
While such markers are still rare, a few have been identified by
relentless searches for variation in genes thought to be involved in the
physiological pathways leading to the phenotype of interest. This so-called
candidate gene approach to marker identification requires a sound
fundamental knowledge of the physiological processes underlying the 
trait and then extensive and sometimes expensive screening for varia¬ 
tion in candidate genes related to these processes. Ideally, once se¬ 
quence variation related to the trait has been found, it can be incorpo¬ 
rated directly into the marker assay. As an excellent example of a 
successful candidate gene search, Shuster et al. (1992) identified the 
genetic defect responsible for leukocyte adhesion deficiency (LAD) in 
Holstein cattle as a missense mutation in the codon for amino acid 128 of CD18.
The mutant and normal alleles were then distinguished by the poly¬ 
merase chain reaction (PCR) providing the ultimate genetic marker of 
this ETL. 

The physiological bases for most economic traits are unknown and 
consequently candidate genes are not obvious. Alternatively, the com¬ 
plexity of the trait often presents a long and unduly cumbersome list of 
candidate genes. ETL, even QTL, can be genetically mapped, however 
(Lander and Botstein, 1989; Paterson et al., 1989; Georges et al., 
1993a; Georges et al., 1993b; Andersson et al., 1994; Georges et al., 
1995). Markers mapped in close proximity to ETL might then be used 
to assist in selection for the ETL if the recombination frequency (map 
distance) is sufficiently small and the chromosomal phase of marker 
and ETL alleles is known. Efficiency of MAS can be increased by iden¬ 
tifying markers flanking the ETL because recombination in the region 
spanned by two markers can be detected. A major goal of animal gene 
mapping has therefore been to produce maps of highly polymorphic 
markers spaced at intervals of 20 centimorgans (cM) (1 cM = 1%
recombination) or less across every chromosome of the targeted species.
These markers can then be applied to families segregating QTL, hope¬ 
fully resulting in the linkage associations of the QTL with one or more 
markers. Under appropriate breeding protocols, these markers may 
then be used for MAS. 
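
The dependence of MAS on map distance and phase can be illustrated with a
short calculation. The sketch below is not taken from the studies cited
above; it assumes Haldane's mapping function (no crossover interference) to
convert map distance into a recombination fraction, and the function names
are hypothetical. It estimates how often a marker allele of known phase
travels with the favorable ETL allele, and how rarely a gamete that is
parental for both flanking markers carries the wrong ETL allele.

```python
import math


def haldane_recombination_fraction(distance_cm: float) -> float:
    """Convert map distance (cM) to a recombination fraction.

    Assumes Haldane's mapping function, i.e. no crossover interference.
    For small distances this is close to the 1 cM = 1% rule of thumb.
    """
    morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def prob_favorable_allele(distance_cm: float, coupling: bool = True) -> float:
    """Probability that a gamete carrying the marker allele also carries
    the favorable (+) ETL allele, for a doubly heterozygous parent.

    coupling=True: marker allele and (+) allele lie on the same parental
    chromosome; coupling=False: they lie on opposite chromosomes.
    """
    r = haldane_recombination_fraction(distance_cm)
    return 1.0 - r if coupling else r


def prob_wrong_allele_given_flanking(left_cm: float, right_cm: float) -> float:
    """Probability that a gamete parental for BOTH flanking markers still
    carries the unfavorable ETL allele (an undetected double crossover)."""
    r1 = haldane_recombination_fraction(left_cm)
    r2 = haldane_recombination_fraction(right_cm)
    double = r1 * r2                   # crossover on each side of the ETL
    none = (1.0 - r1) * (1.0 - r2)     # no crossover on either side
    return double / (none + double)


if __name__ == "__main__":
    # Illustrative distances only: an ETL midway in a 20-cM marker interval.
    print(haldane_recombination_fraction(10.0))
    print(prob_favorable_allele(10.0, coupling=True))
    print(prob_wrong_allele_given_flanking(10.0, 10.0))
```

Under these assumptions a single marker 10 cM from the ETL misclassifies
roughly 9% of gametes, whereas a pair of flanking markers at the same
distances misclassifies about 1%, which is the gain in efficiency described
above.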

Linked markers can be used in selection only if the phase of the 
marker allele and the ETL allele is known. Phase refers to the associa¬ 
tion of trait alleles and marker alleles on a common parental chromo¬ 
some. In some gametes, for example, allele a of the marker locus might 
be physically associated with the positive (+) trait allele, whereas in
another it might be associated with the negative (-) allele. This does 
not present a problem in defined pedigrees where the phase is initially 
known and map distances are small. Linked markers cannot be used in 
the absence of pedigree information, however, unless the marker and 
trait are in linkage disequilibrium.


C. To Facilitate Positional Cloning

A third, and perhaps most important, goal of gene mapping is to 
identify and clone genes responsible for ETL. It is obvious from the 
previous discussion that MAS can be done much more efficiently with 
variation in the gene actually responsible for the ETL. Moreover, a 
complete understanding of the potential interaction of the trait with 
other physiological processes is possible only when we know the genes 
involved. The term reverse genetics has been replaced in common usage 
by positional cloning or map-based cloning to describe the process by 
which the application of map information is used to clone a gene re¬ 
sponsible for a specific trait in the absence of information about the 
biochemical or molecular basis of the trait. Success of positional clon¬ 
ing has been experienced a number of times in searches for human 
disease genes, perhaps best exemplified by the cloning of the gene for 
cystic fibrosis (Rommens et al., 1989). While the task of positionally
cloning genes in any species is a formidable one, cloning genes for ETL 
in livestock is almost prohibitive. Animal maps will most certainly 
never be as dense as those of the human. Large insert libraries for 
livestock species are only beginning to be developed and used. Chromo¬ 
somal deletions of economically important genes, important tools in 
many of the human and mouse successes, have not been identified and 
propagated in livestock. The task is further complicated by the quan¬ 
titative nature of most of the traits of economic interest in farm ani¬ 
mals and the paucity of research support worldwide for animal agricul¬ 
ture relative to the human genome initiative. Alternative strategies to 
conventional positional cloning must be planned and developed. One 
proposed approach is comparative candidate positional cloning, which 
takes advantage of knowledge of the evolutionary history of chromo¬ 
somes and rapid advances in the human and mouse maps. This concept 
will be further developed in Section V. 


III. Methods for Animal Genome Mapping 

A. Synteny Mapping 

Synteny simply means on the same strand or, in genetic terminology, 
on the same chromosome. A synteny map is nothing more than a list of 
genes known to reside on the same chromosome in a particular species. 
An unfortunate misuse of the word synteny abounds in recent genetic 
literature. Such statements as “a greater degree of synteny between
cattle and human than between cattle and mouse,” or “a limited degree
of synteny has been demonstrated between human, mouse and swine” 
appear repeatedly in genetic literature and are meaningless in genet¬ 
ics. “Conserved synteny” was used by Nadeau (1989) to describe the 
location of two or more homologous genes on the same chromosome in 
different species. “Synteny” should not be substituted for “conserved 
synteny” in our comparison of maps between species. Synteny mapping 
is probably associated with comparative mapping because the only 
maps available for comparison between most animal species have 
heretofore been synteny maps. 
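
The distinction between synteny and conserved synteny can be made concrete
with a small sketch. The gene names and chromosome labels below are
hypothetical placeholders, not real map assignments, and the function name
is an assumption introduced for illustration. A pair of genes shows
conserved synteny only if the two genes share a chromosome in each of the
species being compared.

```python
from itertools import combinations


def conserved_synteny_pairs(map_a: dict, map_b: dict) -> list:
    """Return pairs of homologous genes that are syntenic in both species.

    map_a and map_b map gene name -> chromosome label for each species.
    Only genes mapped in both species can be compared.
    """
    shared = sorted(set(map_a) & set(map_b))
    return [
        (g1, g2)
        for g1, g2 in combinations(shared, 2)
        if map_a[g1] == map_a[g2] and map_b[g1] == map_b[g2]
    ]


# Hypothetical synteny maps for two species (illustrative values only).
species_a = {"geneA": "1", "geneB": "1", "geneC": "2", "geneD": "2"}
species_b = {"geneA": "5", "geneB": "5", "geneC": "5", "geneD": "7"}

pairs = conserved_synteny_pairs(species_a, species_b)
print(pairs)
```

In this toy data geneC and geneD are syntenic in the first species but not
in the second, so only the geneA-geneB pair is reported as a conserved
synteny.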

The current revolution in human gene mapping began with the 
building of synteny maps, the success of which can be traced to the 
hybridization of cells from different species (Weiss and Ephrussi, 1966) 
and to the subsequent elimination of human chromosomes in the pro¬ 
liferation of hybrid cells (Ruddle, 1972). The assignment of genes to 
syntenic groups was accomplished by observing the concordant loss or 
retention of gene products. If two genes were lost or retained
concordantly in the random segregation of human chromosomes, they were
assumed to be located on a common chromosome and said to be syn¬ 
tenic. In the early days of somatic cell genetics, electrophoretic differ¬ 
ences in homologous gene products of the progenitor species allowed 
the identification of human specific gene products and subsequently 
the mapping of the human genes encoding them. Cytogenetic analysis 
of hybrid clones then permitted the correlation of a particular gene 
product with a specific human chromosome or chromosome fragment 
and thus the assignment of that gene to a chromosome or chromosomal 
region. These methods with numerous refinements have been thor¬ 
oughly reviewed (Ruddle and Creagan, 1975; Shows, 1977; Shows et 
al., 1982). Somatic cell genetics is a powerful mapping tool because it 
circumvents both the requirements of large numbers of offspring from 
sexual matings and genetic variation within a species. Because of the 
evolutionary divergence of such species as humans and mice, the prob¬ 
ability of finding differences in their homologous genes or gene prod¬ 
ucts is much greater than the probability of finding variation within a 
species. 

Whereas cell hybridization facilitated by Sendai virus or poly¬ 
ethylene glycol has been the principal method for transferring donor 
chromosomes into recipient cells, other methods were developed for 
specific experimental strategies (Ruddle, 1981). The isolation of donor 
chromosomes into microcells (Fournier and Ruddle, 1977) provides
experimental control over chromosome segregation and allows the
transfer of one or a few intact donor chromosomes. Recipient cells can also
be transformed with cell-free preparations of donor chromosomes,
which are degraded into chromosomal fragments after endocytosis 
(Klobutcher and Ruddle, 1981). Eukaryotic cells can also be transformed
by purified DNA (Wigler et al., 1977), either by endocytotic
uptake of calcium-precipitated DNA or by injection with micropipets.
The size of the donor DNA molecule can be specifically controlled in 
these experiments and is often 50 kilobases (kb) or smaller. Each of 
these methods provides a means of transferring progressively smaller 
segments of donor chromosome into recipient cells and consequently 
progressively extending the limit of refinement of gene mapping by 
parasexual methods. 

Though still available as a useful method of gene identification in 
somatic cell gene mapping, the electrophoretic separation of gene prod¬ 
ucts is no longer necessary for parasexual genetics and is in fact sel¬ 
dom used. The limitation of mapping only those genes whose products 
are expressed in cultured cells has been overcome by the use of
molecular probes and PCR primers that distinguish donor and recipient DNA.
One useful mapping tool provided by recombinant DNA technology is 
the use of Southern blotting for restriction fragment mapping. Suc¬ 
cessfully applied to the mapping of mouse and human genes (Ruddle, 
1981; Shows et al., 1982) as well as to many cattle genes as will be
described in detail later, this method depends on differences in the 
location of restriction enzyme sites in the host and donor genomes. 
Heterologous probes have proved extremely useful in mapping live¬ 
stock species in which only a small number of genes have been cloned 
and sequenced. Use of these probes provides a convenient entry into 
comparative gene mapping as well. 

Both enzyme electrophoresis and Southern blotting with probe hy¬ 
bridization have given way to the PCR as the analytical tool of choice 
in synteny mapping. Primers from conserved regions of genes often 
amplify across species, providing the same advantages of heterologous 
probes. As with other methods, conditions and primers must be such 
that the presence or absence of the DNA of one species can be deter¬ 
mined against the background genome of another. 

Somatic cell genetics is still the most common method for building 
synteny maps. Hybrid somatic cells can be constructed such that the 
chromosomes of practically any progenitor species are preferentially 
lost. Each hybrid clone will retain a partial genome of that species 
along with the complete genome of the other, which is usually a trans¬ 
formed rodent cell line. Because chromosome loss is more or less ran¬ 
dom, each clone will retain a different subset of chromosomes from the 
species being mapped. As in human gene mapping, analysis of pairs of 
genes in a panel of hybrid cell lines will reveal concordance or discordance
of their retention. Concordance of retention is evidence for the
location of two genes on the same chromosome. Conversely, discor¬ 
dance of retention is evidence for asynteny, their location on different 
chromosomes. Gene products or DNA sequences may be mapped by 
synteny analysis in any species so long as the presence or absence of 
the gene or gene product of the targeted species can be ascertained 
against the fully retained rodent genomic background. Enzyme electro¬ 
phoresis, Southern blotting with unique sequence probes, and PCR 
amplification with species discriminating primers have all been effec¬ 
tive analytical tools for synteny mapping. 
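
The concordance analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any study cited here: the marker names, the retention profiles (1 = retained, 0 = lost in a given hybrid clone), and the 0.9 concordance threshold are all hypothetical.

```python
from itertools import combinations


def concordance(a, b):
    """Fraction of hybrid clones in which two markers are both retained or both lost."""
    if len(a) != len(b) or not a:
        raise ValueError("retention profiles must be non-empty and of equal length")
    agree = sum(1 for x, y in zip(a, b) if x == y)
    return agree / len(a)


def syntenic_pairs(panel, threshold=0.9):
    """Return marker pairs whose concordance meets the (assumed) threshold."""
    pairs = []
    for (m1, p1), (m2, p2) in combinations(sorted(panel.items()), 2):
        c = concordance(p1, p2)
        if c >= threshold:
            pairs.append((m1, m2, c))
    return pairs


# Hypothetical panel of ten hybrid clones scored for three markers.
panel = {
    "GENE_A": [1, 0, 1, 1, 0, 0, 1, 0, 1, 0],
    "GENE_B": [1, 0, 1, 1, 0, 0, 1, 0, 1, 0],
    "GENE_C": [0, 1, 1, 0, 0, 1, 0, 1, 1, 0],
}

for m1, m2, c in syntenic_pairs(panel):
    print(f"{m1} and {m2} are concordant in {c:.0%} of clones")
```

In this invented panel GENE_A and GENE_B are retained and lost together in every clone, which would be read as evidence of synteny, whereas GENE_C agrees with either of them in only 4 of 10 clones, as expected for a marker on a different chromosome.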

Somatic cell genetics does not typically assign markers to specific 
chromosomal sites or even to chromosomal regions. Consequently, 
genes on a synteny map are usually unordered. Somatic cell methods 
using rearranged chromosomes are an exception to this generalization 
and have been used very effectively to order genes in the human map. 

Radiation hybrid mapping (Cox et al., 1990; Walter et al., 1994) has
become an important tool for constructing high resolution maps of 
human chromosomes. The techniques used are variations of basic so¬ 
matic cell genetics in which the donor cells have been irradiated to 
achieve chromosome fragmentation. Statistical analysis is based on 
the principles of linkage analysis, i.e., the closer two loci are to each
other the less likely they are to be separated by random chromosomal 
rearrangement. First used by Goss and Harris (1975), the technique 
can be used with single chromosome hybrids as the irradiated donor 
(Cox et al., 1990) or with total genome irradiation in a diploid donor cell 
(Walter et al., 1994). Whether used in mapping a single chromosome or 
a whole genome, the technology is effective for constructing contiguous 
maps of mammalian chromosomes at a 500-kb level of resolution. This 
method may prove to be the ideal approach to comparative gene map¬ 
ping because it provides an ordered map without the requirement of 
segregating polymorphisms in breeding populations. 
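
The statistical idea, that closely spaced loci are rarely separated by radiation-induced breakage, can be illustrated with a simplified two-point calculation. The sketch below is an assumption-laden illustration rather than the method of the cited studies: the normalization by the discordance expected for unlinked markers and the -ln(1 - theta) distance transformation are modeled on two-point estimators commonly applied to radiation hybrid data, and the retention profiles are invented.

```python
import math


def breakage_frequency(a, b):
    """Estimate the frequency of breakage between two markers from retention data.

    a, b: lists of 0/1 retention scores for the same radiation hybrids.
    The observed fraction of hybrids retaining only one marker is divided by
    the fraction expected if the markers were retained independently.
    """
    if len(a) != len(b) or not a:
        raise ValueError("retention profiles must be non-empty and of equal length")
    n = len(a)
    ra = sum(a) / n  # retention frequency of marker a
    rb = sum(b) / n  # retention frequency of marker b
    observed = sum(1 for x, y in zip(a, b) if x != y) / n
    expected = ra + rb - 2 * ra * rb
    if expected == 0:
        raise ValueError("markers retained in all or no hybrids are uninformative")
    return min(observed / expected, 1.0)


def map_distance(theta):
    """Convert a breakage frequency to an additive distance (assumed transformation)."""
    if theta >= 1.0:
        return math.inf
    return -math.log(1.0 - theta)


# Hypothetical retention profiles for two markers in ten radiation hybrids.
marker_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
marker_b = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]

theta = breakage_frequency(marker_a, marker_b)
print(f"theta = {theta:.2f}, distance = {map_distance(theta):.3f}")
```

A real analysis would use far more hybrids and multipoint methods; the point here is only that separation frequency, not meiotic recombination, is the quantity being measured.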


B. In Situ Hybridization 

Unique sequences, repetitive elements, and whole genomes have all 
been successfully localized to chromosomal sites by in situ hybridiza¬ 
tion. This technique uses the attachment of a microscopically detect¬ 
able marker to a DNA probe followed by hybridization of the probe to 
denatured DNA of an otherwise intact chromosome. The specificity of 
hybridization is determined by the uniqueness of the probe. While 
radioactive probes dominated the early application of this technology, 
fluorescent probes are now generally used. In her review of fluorescence
in situ hybridization (FISH), Trask (1991) notes the following
advantages of FISH over isotopic labeling. It provides superior spatial 
resolution and generally requires the visualization of fewer labeled 
chromosomes. It is faster and the probe is more stable. The sensi¬ 
tivities are similar, each requiring a few kb pairs of uninterrupted 
sequence as the hybridizing probe. Schemes have been developed that 
allow multiple probes with different color signals to be used on the 
same chromosomes. The latter is particularly important for gene map¬ 
ping because it provides the potential for ordering loci within the limits 
of approximately 100-kb resolution. 

Cosmids, cloning vectors that accommodate large DNA sequences 
(up to 45 kb), have been effectively used as probes for FISH. Because 
these large inserts often contain repetitive DNA, the target DNA must 
first be prehybridized with unlabeled total genomic DNA. As described 
later, this method has been effectively used to anchor linkage maps to 
chromosomes by hybridizing cosmids that contain highly polymorphic 
markers used in linkage mapping. 

The technologies just discussed result in physical maps that describe 
the physical relationships of loci to the chromosomes on which they 
reside. Higher resolution physical maps that define markers in contig¬ 
uous clones (contig maps) are forthcoming in livestock but will most 
likely span small genomic regions of special interest rather than whole 
chromosomes as is targeted in the human genome initiative and is 
prerequisite to total genome sequencing. 


C. Linkage Mapping 

Linkage maps are defined in biological rather than physical terms, a 
map unit representing 1% recombination in meiosis. Because linkage 
is measurable only in gametic products, linkage mapping requires de¬ 
tection of maternal and paternal alleles in gametes of heterozygous 
individuals. A linkage map is then made from the percentage of recom¬ 
binants in the parental arrangements of alleles of two loci on a chromo¬ 
some. The segregation of three or more loci permits ordering of genes 
on the map because double recombinants are rare relative to single 
recombinants. 
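
The arithmetic behind a three-locus cross can be sketched as follows. This is an illustrative example only: the locus names X, Y, and Z and the gamete counts are hypothetical, phase is assumed to be known, and each gamete is coded by the parental haplotype (0 or 1) from which its allele at each locus was inherited. The two loci showing the highest recombination fraction are taken as the flanking loci, which is equivalent to saying that the rarest (double recombinant) class identifies the middle locus.

```python
from itertools import combinations


def recombination_fraction(gametes, i, j):
    """Fraction of gametes whose alleles at loci i and j derive from different parental haplotypes."""
    total = sum(gametes.values())
    recombinant = sum(n for g, n in gametes.items() if g[i] != g[j])
    return recombinant / total


def order_three_loci(names, gametes):
    """Order three loci by placing the pair with the largest recombination fraction on the outside."""
    rf = {
        (i, j): recombination_fraction(gametes, i, j)
        for i, j in combinations(range(3), 2)
    }
    a, b = max(rf, key=rf.get)
    middle = ({0, 1, 2} - {a, b}).pop()
    return (names[a], names[middle], names[b]), rf


# Hypothetical counts of 100 gametes; tuple positions correspond to loci X, Y, Z.
names = ["X", "Y", "Z"]
gametes = {
    (0, 0, 0): 40, (1, 1, 1): 40,  # parental (nonrecombinant) classes
    (0, 1, 0): 6,  (1, 0, 1): 6,   # Y separated from X and Z
    (0, 0, 1): 3,  (1, 1, 0): 3,   # Z separated from X and Y
    (1, 0, 0): 1,  (0, 1, 1): 1,   # X separated from Y and Z (rarest class)
}

order, rf = order_three_loci(names, gametes)
print("order:", "-".join(order))
for (i, j), value in rf.items():
    print(f"{names[i]}-{names[j]}: {100 * value:.0f} map units")
```

With these invented counts the X-Y and X-Z intervals are 14 and 8 map units, and the Y-Z interval is 18 rather than 22 map units because double recombinants go undetected between the outer loci.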

Markers for mapping purposes have been categorized as Type I or II
by O’Brien (1992). Type I markers are expressed sequences (genes) and 
are usually conserved from one mammalian species to another. Thus, 
they are the material for comparative gene mapping. Unfortunately, 
they are usually not highly polymorphic and therefore are difficult to 
incorporate into linkage maps. Type II markers are highly polymorphic
and are more widely used for linkage mapping. The value of these 
markers lies in their high level of polymorphism. They are not neces¬ 
sarily expressed genes but often anonymous stretches of DNA. 

The polymorphic DNA markers that dominate the linkage maps of 
most plant and animal species are one of two general classes. Restric¬ 
tion fragment length polymorphisms (RFLPs) identify alleles on the 
basis of polymorphism in the presence or absence of specific restriction 
enzyme cleavage sites. Genomic DNA is digested with a particular re¬ 
striction enzyme, separated by electrophoresis, and blotted onto a solid 
membrane (Southern, 1975). Hybridization to a labeled probe often 
reveals polymorphism in the size of the DNA fragments produced. One 
major advantage of RFLPs as linkage markers is that they can identify 
polymorphisms in or around Type I markers, assuming the probe used 
for detection is a cloned gene or complementary DNA (cDNA). They 
are, therefore, useful for comparative mapping and also provide a po¬ 
tential source of variation in candidate genes for economic traits. Blot¬ 
ting and probe development are laborious and time consuming, how¬ 
ever, and blotting exhausts DNA resources much faster than do PCR- 
based methods. 

Microsatellites (Weber and May, 1989; Fries et al., 1990) have be¬ 
come the Type II marker of choice in linkage mapping in most animal 
species. These islands of di-, tri-, or tetranucleotide repeats are proving 
to be ubiquitous throughout animal genomes. The number of repeats is 
usually the polymorphic entity, with a half dozen or more alleles not 
uncommon in a breeding population. Microsatellites are defined by 
unique flanking sequences that serve as primers for PCR amplification 
and also as tags of unique sites in the respective genome. 
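
Because the alleles of a microsatellite differ in the number of repeat units, an allele can be described by a simple repeat count. The following sketch shows the idea on sequence strings; the flanking sequences, the CA repeat unit, and the two alleles are hypothetical, and in practice alleles are usually scored by the size of the PCR product rather than by sequence.

```python
import re


def longest_repeat_run(sequence, unit):
    """Return the number of repeat units in the longest uninterrupted run of `unit`."""
    unit = unit.upper()
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [m.group(0) for m in pattern.finditer(sequence.upper())]
    if not runs:
        return 0
    return max(len(run) for run in runs) // len(unit)


# Hypothetical unique flanking sequences that would serve as primer sites.
LEFT_FLANK = "GGATCC"
RIGHT_FLANK = "TTGACG"

# Two hypothetical alleles differing only in the number of CA repeats.
allele_1 = LEFT_FLANK + "CA" * 12 + RIGHT_FLANK
allele_2 = LEFT_FLANK + "CA" * 15 + RIGHT_FLANK

for name, seq in (("allele 1", allele_1), ("allele 2", allele_2)):
    print(name, "has", longest_repeat_run(seq, "CA"), "CA repeats")
```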

Other types of markers on linkage maps include enzyme polymor¬ 
phisms, blood group antigens, and phenotypic traits such as coat color 
or horn development. These all add interest and biological significance 
to the map. Moreover, they provide potential for enhanced comparative 
mapping across species. 

Linkage mapping requires observable meiotic products, usually in 
the form of offspring from individuals segregating the markers just 
discussed. Large full sib-ships or half sib-ships are ideal for linkage 
analysis in farm animals. Because a comprehensive map requires large 
numbers of markers scored on a common set of meiotic products, shar¬ 
ing of family material among laboratories is the most effective ar¬ 
rangement for generating a linkage map. Sets of shared reference fam¬ 
ilies are available for most species, including cattle (Barendse et al., 
1994). Offspring are not essential for linkage analysis, however, because
multiple DNA loci can now be analyzed for allelic variation directly
on meiotic products. Van Eijk et al. (1993) successfully ordered
three loci on bovine chromosome 23 by PCR after primer extension 
preamplification (PEP) of individual sperm. 

D. Comparative Mapping 

Comparative gene mapping depends entirely on the accurate assess¬ 
ment of homology between genes of different species. It is not always 
easy to make this assessment. The issue of gene homology has been 
addressed repeatedly in reports of the Committee on Comparative 
Mapping, heretofore an integral component of the Human Gene Map¬ 
ping (HGM) workshop. Although the technologies for detecting genes 
have evolved at a rapid rate, the principles as outlined in HGM 10 can 
still be regarded as standards for the assessment of interspecific gene 
homology (Lalley et al., 1989). These principles are as follows: 

1. Molecular structure 

a. Similar nucleotide or amino acid structure 

b. Similar immunological cross-reaction 

c. Similar subunit structure 

d. Formation of functional heteropolymeric molecules in inter¬ 
specific cell hybrids in cases of multimeric proteins 

e. Cross-hybridization to the same molecular probe 

2. Biological or biochemical function 

a. Similar tissue distribution 

b. Similar developmental time of appearance 

c. Similar pleiotropic effects 

d. Identical subcellular locations 

e. Similar substrate specificity 

f. Similar response to specific inhibitors 

The increased use of heterologous probes has had a major impact on 
comparative gene mapping. Although there is probably not a better 
single criterion for gene homology than cross-hybridization with a mo¬ 
lecular probe, the Committee on Comparative Mapping has recognized 
an inherent danger in using probes characterized in other laboratories 
for comparative mapping purposes. Obviously, a mislabeled probe used 
for mapping can lead to an erroneous comparative chromosomal as¬ 
signment. The committee strongly recommends that investigators re¬ 
ceiving probes from other laboratories perform the tests necessary to 
verify that the probe being used is the same as the one used in pre¬ 
viously published mapping experiments. These criteria include the 
following: 
1. Verification of the restriction sites in the plasmid or phage clone

2. Verification of several restriction fragments after Southern analy¬ 
sis of human or genomic DNA 

3. Partial sequence analysis of clones 

4. PCR homology with an included oligonucleotide 

5. Other traditionally accepted methods of probe verification 

If this recommendation is not generally followed, some misassignments
are certain to occur and to result in significant problems. The
same strict guidelines must be applied to genes analyzed by the PCR. 
Primer sequences may be selected from conserved regions of a gene in 
one species and used for amplification in another. Sufficient exon se¬ 
quence should be included in cross-species primer design to allow for 
sequencing and homology assessment. 

The previous discussion assumes the use of biochemical or molecular 
gene markers for comparative mapping. However, just as albino and 
pink-eye dilute were identified as homologous mutations in different 
species (Haldane et al., 1915; Castle and Wachter, 1924; Feldman, 
1924; Clark, 1936), mutations with similar phenotypic expression are 
still useful comparative markers. Unfortunately, there does not appear 
to be an abundance of obvious homologous mutations in different spe¬ 
cies. The mapping of two similar mutations to a known conserved 
region in two different species, however, is good preliminary evidence 
for homology of the mutations. 


IV. Current Status of Animal Genome Maps 

A. Physical Mapping 

More than 400 Type I loci have been mapped in cattle (Fries et al., 
1993; O’Brien et al., 1993) primarily through somatic cell genetics. Of
the 100 or so chromosomal assignments in the sheep map (Echard et 
al., 1994), most were made by synteny mapping. More than 80 of these 
are Type I markers. While these synteny-mapped markers indicate the 
boundaries of chromosomal conservation relative to the map-rich ge¬ 
nomes of mice and humans and show extensive sheep-cattle conserva¬ 
tion, they provide an incomplete comparative map. Conservation of 
gene order is not addressed by the synteny map. 

In situ hybridization, especially FISH, has been used effectively to 
address the order of Type I loci, to assign syntenic groups to specific 
chromosomes, and to anchor the rapidly growing linkage map to chromosomes
(Fries et al., 1993; Solinas-Toldo et al., 1993; Iannuzzi et al.,
1993; Gallagher et al., 1993b). There are currently more than 50 in situ
localizations of unique sequences on cattle chromosomes. All bovine 
syntenic groups are now assigned to specific chromosomes and the 
bovine linkage map is anchored at 35 sites on 26 chromosomes. Thirty- 
four in situ assignments in sheep were reviewed by Echard et al. 
(1994). At least 80 chromosomal assignments have been made in pigs, 
most by in situ hybridization (Andersson et al., 1993). Sixty of these 
are Type I markers. A promising new technique uses in situ PCR and 
permits the visualization of amplified sequences directly on porcine 
chromosomes (Troyer et al., 1994; Xie et al., 1995).

B. Linkage Mapping 

The published bovine linkage map of 200 markers (Barendse et al., 
1994) has grown to almost 800 markers since its initial publication 
(Barendse, personal communication), which was made possible in 
large measure by international cooperation and the use of a common 
set of reference families. Combined with the independent development 
of other maps (Bishop et al., 1994), there are probably more than 1200 
markers currently assigned to cattle linkage groups. The goal of 20 cM 
resolution has clearly been achieved over at least 95% of the total 
genome. The international map includes 65 Type I markers. A pig linkage
map of 128 markers, 60 of which are informative for comparative
mapping, was published by Ellegren et al. (1994). Rohrer et al. (1994) 
published a 383-marker map of the pig, almost all of which were microsatellites.
Two reference families of chickens have each produced linkage
maps in excess of 200 markers (Burt et al., 1995).

C. Comparative Mapping 

We began comparative mapping in cattle in our laboratory by con¬ 
structing a panel of hybrid somatic cells and assessing gene segrega¬ 
tion by enzyme electrophoresis (Womack and Moll, 1986). Initially, 
these cattle-hamster hybrid somatic cells were analyzed by cellulose- 
acetate electrophoresis for 28 enzyme gene products, including such 
markers as GAPD, ITPA, ADA, ACO1, GDH, GUK, CAT, and GLO1.
The 28 loci were organized into 21 independent syntenic groups bring¬ 
ing the composite bovine gene map at that time to 35 loci on 24 syn¬ 
tenic groups. A total of 32 homologous genes were then mapped in 
humans, mice, and cattle. Conservation of cattle and human linkage 
groups was evidenced by only three syntenic discordancies among 32 
loci as contrasted to nine discordancies among the same loci in the 
human and mouse maps. This was the initial suggestion that cattle
and human chromosomal conservation exceeded that of either to the 
laboratory mouse. 

A different but related somatic cell approach facilitated the
mapping of the PRGS and PAIS genes in cattle (McAvin et al., 1988).
Individual cow-hamster hybrid cell lines established by fusion of mu¬ 
tant CHO cells, ade-C and ade-G, with cattle leukocytes required com¬ 
plementing bovine genes for PRGS and PAIS, respectively, when prop¬ 
agated on selective media. Homogenates of 12 PRGS+ hybrid clones 
and 12 PAIS+ hybrid clones retained the bovine electromorph of SOD1
while extensively segregating 14 biochemical markers of other cattle 
syntenic groups. Secondary cattle-hamster hybrid subclones, which 
segregated bovine PRGS and PAIS in late passages on nonselective 
media, concordantly segregated bovine SOD1. These data supported a 
syntenic relationship among PRGS, PAIS, and SOD1 on cattle syntenic 
group U10. An interferon receptor (IFREC) locus was previously 
known to be syntenic with SOD1. This synteny demonstrated an exten¬ 
sive conservation of bovine U10 and the Down syndrome region of 
human chromosome 21. Moreover, it expanded the boundaries of so¬ 
matic cell genetics in domestic animals to use of auxotrophic recipient 
cell lines to preferentially retain a specific donor chromosome. 

Verification of these results and expansion of the map of the SOD 
gene family were provided by Gallagher et al. (1992a). cDNA probes of 
human extracellular superoxide dismutase (EC-SOD) and bovine su¬ 
peroxide dismutase 1 (SOD1) genes were hybridized to Southern blots 
containing genomic DNAs from the same cow-rodent somatic cell lines 
segregating bovine chromosomes. The SOD1 probe identified two loci: 
the coding locus (SOD1), which mapped to bovine U10; and a related 
locus (SOD1L), which mapped to U11. EC-SOD mapped to bovine U15.
The mapping of EC-SOD to U15 further defined a region of extensive 
syntenic conservation between humans and domestic cows. 

Mapping families of genes is an efficient and biologically interesting 
approach to the comparative mapping of animal genomes. DNA from 
the bovine-hamster hybrid cell panel was analyzed by blot hybridization
with alpha and beta interferon probes (Adkison et al., 1988a).
Retention and loss of the bovine interferon genes was compared to 
segregation of bovine isozyme loci representing the previously de¬ 
scribed syntenic groups. Families of bovine alpha (IFNA) and beta 
(IFNB) interferon genes were segregated in concordance with each 
other and with aconitase-1 (ACO1) on bovine syntenic group U18. This
syntenic relationship is conserved on human chromosome 9p and on 
the portion of mouse chromosome 4 proximal to the centromere. In 
addition, cattle RFLPs were identified with both IFNA and IFNB
probes. These polymorphisms provided valuable tools for subsequent 
searches for relationships of host resistance to disease and variation in 
candidate genes. 

These studies were expanded to another chromosomal region by Adkison
et al. (1988b). DNA from cow-hamster and cow-mouse somatic
hybrid cells segregating bovine chromosomes was analyzed by Southern
blotting and hybridization with human fibronectin and gamma
crystallin probes. Concordance of retention of these bovine genes was 
compared to cattle isozyme loci representing all known syntenic 
groups. Bovine fibronectin (FN1) and gamma crystallin (CRYG) fragments
were concordant with each other and with isocitrate dehydrogenase
1 (IDH1), representing bovine syntenic group U17. The syntenic
relationship of these genes is conserved on human chromosome
2q and also on mouse chromosome 1. In addition, bovine RFLPs were 
identified with both fibronectin and gamma crystallin probes. These 
polymorphisms proved valuable for the study of recombination be¬ 
tween the syntenic loci in pedigreed herds and to mark a segment of 
the bovine genome that is homologous to the Lsh region of mouse 
chromosome 1, which confers resistance in mice to several intracellu¬ 
lar parasites. 

Hallerman et al. (1988) mapped the genes encoding bovine prolactin 
and rhodopsin to syntenic groups on the basis of hybridization of DNA 
from hybrid cells with cloned prolactin and rhodopsin gene probes. 
Prolactin was found to be syntenic with previously mapped glyoxalase, 
BoLA, and 21-hydroxylase genes, establishing a syntenic conservation 
with human chromosome 6. The presence of bovine rhodopsin se¬ 
quences among the various hybrid cell lines was not concordant with 
any gene previously assigned to one of the 23 defined autosomal syn¬ 
tenic groups. Thus, rhodopsin marked a new bovine syntenic group, 
leaving only five cattle autosomes unmarked at that time by at least 
one biochemical or molecular marker. 

Parathyroid hormone and the beta hemoglobin gene cluster, which 
are closely linked on human chromosome 11p15, were localized to
bovine syntenic group U7 with the gene for catalase by Foreman and
Womack (1989). RFLPs were followed through informative pedigrees 
to determine a linkage map distance of approximately 15 cM between 
the parathyroid hormone and hemoglobin genes. This work began the 
integration of the rapidly growing synteny map to the fledgling linkage 
map of the bovine genome. 

The immunoglobulin genes are an important gene family both from 
the evolutionary and host-resistance perspective. Their mapping was 
the target of a study by Tobin-Janzen and Womack (1992). Comparative
gene mapping in mammals suggested that the bovine immunoglobulin
heavy chain genes, IGHG4 and IGHM, might be syntenic
with the FOS oncogene. Interestingly, however, when these genes were 
assigned to bovine syntenic groups in our panel of hybrid somatic cells, 
IGH genes were shown to be syntenic with the FES oncogene rather 
than FOS. IGH and FES were assigned to Bos taurus chromosome 21 
while FOS was assigned to chromosome 10. In addition, bovine-specific 
immunoglobulin-like sequences were observed in the hybrid somatic 
cells, and one, IGHML1, was mapped to bovine syntenic group U16. 
The probes used for somatic cell mapping were also used to screen a 
small number of cattle of several different breeds for RFLPs. IGHG4 
and IGHM were shown to be highly polymorphic, while FOS and FES 
were not. 


Two bovine DNA probes (LCa and LCb) complementary to the
clathrin light chain genes were hybridized to DNA from the bovine-hamster
hybrid somatic cell lines by Gallagher et al. (1991). Concordancy
of retention of the clathrin genes was compared to existing syntenic
data for the domestic cow, which, by this time, consisted of a
substantial data base. LCb identified a single locus, CLTB, concordant
with the genes encoding bovine anti-Mullerian hormone (AMH) and
bovine osteonectin from bovine syntenic group U22. LCa recognized
two loci: CLTAL1, from a previously unidentified bovine syntenic
group; and CLTAL2, which is concordant with GGTB2, a gene marker
for bovine syntenic group U18.

A significant advance in the bovine genome map was precipitated by
a combination of somatic cell genetics and in situ hybridization (Fries
et al., 1991). The chromosomal locations of bovine class I and class II
cytokeratin sequences were determined in the Fries laboratory using
in situ hybridization and in our laboratory using Southern blot hybridization
to DNA from the hybrid somatic cells to establish syntenic
relationships. The main signals were found over chromosome region
19q16 → qter after in situ hybridization with two probes for the class I
cytokeratin gene subfamily (KRT10 and KRT19) and over region 5q14
→ q23 after hybridization with probes for the class II gene subfamily
(KRT1, KRT5, and KRT8). These regions are thought to contain the
loci of functional cytokeratin genes, with KRT10 and KRT19 mapping
to 19q21 and KRT1, KRT5, and KRT8 to 5q21. The in situ hybridization
data were then corroborated by analysis of the somatic hybrid cell
panel. The genes for the class I keratins segregated concordantly with
each other and with syntenic group U21 in the somatic cell panel but
were discordant with the class II keratin genes. The class II keratin
genes segregated concordantly with each other and with syntenic
group U3. Two class II gene probes gave an additional minor signal
above chromosome region 5q25 → q33 after in situ hybridization, while
another class II probe yielded a minor signal above chromosome region
10q31 → qter. When the latter probe and an additional linked probe
were hybridized to DNA from the hybrid panel, two independently 
segregating loci were recognized, one of which cosegregated with the 
class II subfamily in syntenic group U3 and the other with syntenic 
group U5. These data confirmed the chromosomal assignment of two 
syntenic groups and allowed the assignment of a formerly unassigned 
syntenic group to a specific chromosome, a significant step at that time 
in the growth of the gene map. 

Bovine genes encoding T-cell receptor, CD3, and CD8 molecules were 
mapped to syntenic groups in hybrid somatic cells by Li et al. (1992). 
T-cell receptor α and δ chains were assigned to bovine syntenic group
U5, and the β and γ genes were syntenic with each other and with
markers on U13. CD3E and CD3D genes were syntenic with each other 
and located to bovine syntenic group U19. CD8 was most concordant 
with markers of syntenic group U16. These data expanded the compar¬ 
ative gene maps of human chromosome 7, bovine syntenic group U13, 
and mouse chromosomes 6 and 13 and suggested extensive evolution¬ 
ary conservation among the mammalian taxa. 

To establish syntenic relationships of phototransduction genes, Gallagher
et al. (1992b) mapped the genes encoding the α-, β-, and
γ-subunits of rod cGMP phosphodiesterase (PDE) (PDEA, PDEB,
PDEG), the α'-subunit of cone PDE (PDEA2), and the rod cGMP-gated
channel (CNCG) to bovine syntenic groups. The rod cGMP PDE α-, β-,
and γ-subunit genes mapped to bovine syntenic groups U22, U15 (chromosome
6), and U21 (chromosome 19), respectively. The rod cGMP-gated
channel gene also mapped to syntenic group U15, and the bovine
cone α'-subunit gene mapped to U26 (chromosome 26). With the exception
of the cone PDE α'-subunit gene, which had not been mapped in
other mammals, all of these genes were assigned to conserved chromo¬ 
somal regions shared among bovine, human, and mouse. A compilation 
of known syntenic assignments and hypotheses regarding future as¬ 
signments of phototransduction genes in human, mouse, and cattle 
was used to demonstrate the value of the bovine map for comparative 
genomic predictions of gene location in humans and mice. 

A phage library of bovine genomic DNA was screened for hybridiza¬ 
tion with a human HSP70 cDNA probe by Grosz et al. (1992) and 21 
positive plaques were identified and isolated. Restriction mapping and 
blot hybridization analysis of DNA from the recombinant plaques demonstrated
that the cloned DNAs were derived from three different regions
of the bovine genome. One region contained two tandemly arrayed
HSP70 sequences, designated HSP70-1 and HSP70-2, separated
by approximately 8 kb of DNA. Single HSP70 sequences, designated 
HSP70-3 and HSP70-4, were found in two other genomic regions. Lo¬ 
cus-specific probes of unique flanking sequences from representative 
HSP70 clones were hybridized to restriction endonuclease-digested 
DNA from the somatic cell hybrid panel to determine the chromosomal 
location of the HSP70 sequences. The probe for the tandemly arrayed 
HSP70-1 and HSP70-2 sequences mapped to bovine chromosome 23, 
syntenic with glyoxalase 1, 21 steroid hydroxylase, and major histocompatibility
class I loci. HSP70-3 sequences mapped to bovine chromosome
10, syntenic with nucleoside phosphorylase and murine osteosarcoma
viral oncogene (v-fos), and HSP70-4 mapped to bovine
syntenic group U6, syntenic with amylase 1 and phosphoglucomutase 
1. Based on these data, bovine HSP70-1,2 was proposed to be homolo¬ 
gous to human HSPA1 and HSPA1L on chromosome 6p21.3, bovine 
HSP70-3 was proposed as the homolog of an unnamed human HSP70 
gene on chromosome 14q22-q24, and bovine HSP70-4 was proposed 
homologous to one of the human HSPA-6, -7 genes on chromosome 1. 
Thus, comparative mapping was used as a tentative assessment of 
gene homology in a complex gene family. 

Ryan et al. (1992) confirmed the previous assignment of bovine α-
(IFNA) and β- (IFNB) interferon gene families to syntenic group U18
and assigned this syntenic group to chromosome 8. FISH localized 
these genes to bovine chromosome 8, band 15, and demonstrated that 
with biotinylated plasmids, as few as five tandemly arrayed sequences 
can be detected by conventional fluorescent microscopy. 

Amplification of an ancestral lysozyme gene in artiodactyls appears 
to be associated with the evolution of foregut fermentation in the rumi¬ 
nant lineage and has resulted in about 10 lysozyme genes in true ru¬ 
minants. Hybridization of a cow stomach lysozyme 2 cDNA clone to re¬ 
stricted DNA from our panel of cow-hamster hybrid cell lines revealed 
that all but one of the multiple bovine-specific bands segregated concordantly
with the marker for bovine syntenic group U3 (Gallagher et
al., 1993a). The anomalous band was subsequently mapped to bovine 
syntenic group U22 (chromosome 7) with a second panel of hybrids 
representing all 31 bovine syntenic groups. By two-dimensional 
pulsed-field gel electrophoresis, the lysozyme genes on cattle U3 (chro¬ 
mosome 5) were shown to be clustered on a 2- to 3-Mb DNA fragment, 
while the lactalbumin gene and pseudogenes that are paralogous and 
syntenic with the lysozymes were outside the lysozyme gene cluster. 
FISH of a cocktail of lysozyme genomic clones localized the lysozyme
gene cluster to cattle chromosome 5, band 23, corroborating the so¬ 
matic cell assignment. 

The asymmetric unit membrane (AUM) of the apical surface of mam¬ 
malian urinary bladder epithelium contains several major integral 
membrane proteins, including uroplakins IA and IB (both 27 kDa), II 
(15 kDa), and III (47 kDa). These proteins are synthesized only in 
terminally differentiated bladder epithelial cells. They are encoded by 
separate genes and, except for uroplakins IA and IB, appear to be
unrelated in their amino acid sequences. The genes encoding these 
uroplakins were mapped by Ryan et al. (1993) to chromosomes of cattle 
through their segregation in this same panel of bovine-rodent somatic 
cell hybrids. Genes for uroplakins IA, IB, and II were mapped to bovine 
chromosomes 18 (UPK1A), 1 (UPK1B), and 15 (UPK2), respectively. 
Two bovine genomic DNA sequences reactive with a uroplakin III 
cDNA probe were identified and mapped to BTA 6 (UPK3A) and 5 
(UPK3B). They also mapped genes for uroplakins IA and II in mice, to 
the proximal regions of mouse chromosomes 7 (Upk1a) and 9 (Upk2),
respectively, by analyzing the inheritance of restriction fragment 
length variants in recombinant inbred mouse strains. These assign¬ 
ments are consistent with linkage relationships known to be conserved 
between cattle and mice. The mouse genes for uroplakins IB and III 
were not mapped because the mouse genomic DNA fragments that 
hybridized with these probes were invariant among the inbred strains 
tested. Although the stoichiometry of AUM proteins is nearly constant, 
the fact that the uroplakin genes are unlinked indicates that their 
expression might be independently regulated. These results also pre¬ 
dict likely positions for two human uroplakin genes. 

These studies demonstrate the success of the gene family approach 
to developing the map of the cattle genome, an approach equally applicable
to any species for which hybrid cell lines are available. Another
approach has been to directly evaluate regions of possible chromosome
conservation by mapping in cattle genes that span the entirety of a
specific human (or mouse) chromosome. Using the same panel of hybrid
somatic cells, sequences homologous to genes spanning human
chromosome arm 8q were syntenically assigned in cattle by Threadgill 
et al. (1990b). Thyroglobulin (TG), carbonic anhydrase II (CA2), and 
the proto-oncogenes myc and mos were assigned to a newly identified 
bovine syntenic group, U23. Additionally, in situ hybridization of the 
TG probe to bovine metaphase chromosomes revealed this syntenic 
group to be on bovine chromosome 14 and the bovine TG gene to reside 
at 14q12→q15.

Four genes having homologous loci on the short arm of human chromosome
8 were mapped to two different bovine syntenic groups
(Threadgill and Womack, 1991b). The gene coding for the tissue-type 
plasminogen activator mapped with GSR, a human chromosome 8 
marker of bovine syntenic group U14, while lipoprotein lipase and the 
medium and light neurofilament polypeptide genes were shown to be 
syntenic with the human chromosome 9 marker GGTB2, which is on 
bovine syntenic group U18. 

The somatic cell panel was also used to investigate the syntenic 
relationship of nine loci in the bovine that have homologous loci on 
human chromosome 9 (Threadgill and Womack, 1990). Six loci
(ALDH1, ALDOB, C5, GGTB2, GSN, and ITIL) were assigned to the
previously identified bovine syntenic group U18 represented by ACO1,
whereas the other three loci (ABL, ASS, and GRP78) mapped to a
new, previously unidentified autosomal syntenic group. Additionally, a
secondary locus, ABLL, which cross-hybridized with the ABL probe,
was mapped to bovine syntenic group U1 with the HSA 1 loci PGD and
ENO1. The results predicted that ACO1 would map proximal to
ALDH1; GRP78 distal to ITIL and C5; GSN proximal to AK1, ABL,
and ASS on HSA 9; GRP78 to MMU 2; and ITIL and GSN to MMU 4.

Hybrid somatic cells were used again to investigate the syntenic 
relationship of nine loci in the bovine that have homologous loci on 
human chromosome 12 (Threadgill et al., 1990a). Eight loci, including
A2M, GLI, HOX3, IFNG, INT1, KRAS2, NKNB, and PAH, were assigned
to the previously identified bovine syntenic group U3 represented
by GAPD. However, a single locus from the q-terminus of HSA
12, ALDH2, mapped to a new, previously unidentified autosomal syntenic
group. These results indicated the existence of a very large ancestral
syntenic group spanning from the p-terminus to q24 of HSA 12
and containing over 4% of the mammalian genome. Additionally, the
results predicted that ALDH2 is distal to PAH and IFNG on HSA 12,
the type II keratin gene complex resides between q11 and q21 of HSA
12, A2M will map to MMU 6, and LALBA and GLI are on MMU 15.

To determine the extent of conservation between bovine syntenic 
group U10, human chromosome 21 (HSA 21), and mouse chromosome 
16 (MMU 16), 11 genes were physically mapped by segregation analysis
in somatic cells by Threadgill et al. (1991). The genes chosen for
study spanned MMU 16 and represented virtually the entire q arm of
HSA 21. Because the somatostatin (SST) gene was previously shown to
be in U10, the transferrin (TF) gene, an HSA 3/MMU 9 marker, was
also mapped to determine whether U10 contained any HSA 3 genes not
represented on MMU 16. With the exception of the protamine gene
PRM1 (HSA 16/MMU 16), all of the genes studied were syntenic on
bovine U10. Thus, all homologous loci from HSA 21 that have been 
studied in the cow are on a single chromosome. The bovine homolog of 
HSA 21 also carries several HSA 3 genes, two of which have homolo¬ 
gous loci on MMU 16. The syntenic association of genes from the q arm 
of HSA 3 with HSA 21 genes in two mammalian species, the mouse and 
the cow, indicated that HSA 21 may have evolved from a larger ances¬ 
tral mammalian chromosome that contained genes now residing on 
HSA 3. Additionally, the syntenic association of TF with SST in the 
cow permits the prediction that the rhodopsin (RHO) gene is proximal 
to TF on HSA 3q. 

Loci homologous to those on human chromosome 10 (HSA10) map to 
five mouse chromosomes: MMU2, MMU7, MMU10, MMU14, and 
MMU19. In cattle, one unassigned syntenic group (U26) was previously
defined by the HSA10/MMU19 isoenzyme marker glutamic-oxaloacetic
transaminase 1 (GOT1). To evaluate the syntenic arrangement
of other HSA10 loci in cattle, seven genes were physically
mapped by Threadgill and Womack (1991a). The genes mapped include
vimentin (VIM) on HSA10 and MMU2; interleukin 2 receptor (IL2R)
on HSA10 and MMU?; ornithine aminotransferase (OAT) on HSA10
and MMU7; hexokinase 1 (HK1) on HSA10 and MMU10; retinol-binding
protein 3 (RBP3) on HSA10 and MMU14; plasminogen activator,
urokinase type (PLAU) on HSA10 and MMU14; and α-2-adrenergic
receptor (ADRA2) on HSA10 and MMU19. VIM and IL2R mapped to
bovine U11; ADRA2 and OAT mapped to bovine U26; and RBP3,
PLAU, and HK1 mapped to bovine U28.

Homologs to genes residing on human chromosome 3 (HSA 3) map to 
four mouse chromosomes (MMU) 3, 6, 9, and 16. In the bovine, two 
syntenic groups that contain HSA 3 homologs, unassigned syntenic 
groups 10 (U10) and 12 (U12), were known. U10 also contains HSA 21 
genes, which is similar to the situation seen on MMU16; whereas, U12 
apparently contains only HSA 3 homologs. The syntenic arrangement 
of other HSA 3 homologs in the bovine was investigated by Threadgill 
and Womack (1991c). The genes mapped include Friend-murine leukemia
virus integration site 3 homolog (FIM3; HSA 3/MMU 3), sucrase-isomaltase
(SI) and glutathione peroxidase 1 (GPX1) (HSA 3/MMU ?),
murine leukemia viral (v-raf-1) oncogene homolog 1 (RAF1; HSA
3/MMU 6), and ceruloplasmin (CP; HSA 3/MMU 9). FIM3, SI, and CP
mapped to bovine syntenic group U10, while RAF1 and GPX1 mapped 
to U12. 

Genes homologous to those located on human chromosome 4 (HSA4) 
were mapped in the bovine by Zhang et al. (1992) to determine regions 
of syntenic conservation among humans, mice, and cattle. Previous
studies had shown that two homologs of genes on HSA4, PGM2 and
PEPS, were located in bovine syntenic group U15 (chromosome 6). The
homologous mouse genes, Pgm-1 and Pep-7, are on MMU5. Using the 
panel of bovine-hamster hybrid somatic cells, they assigned homologs 
of 11 additional HSA4 loci to their respective bovine syntenic groups. 
D4S43, D4S10, QDPR, IGJ, ADH2, KIT, and IF were assigned to syntenic
group U15. This syntenic arrangement is not conserved in the
mouse, where D4s43, D4s10, Qdpr, and Igj are on MMU5 while Adh-2
is on MMU3. IL-2, FGB, FGG, and F11, which also reside on MMU3,
were assigned to bovine syntenic group U23. These data suggested 
that breaks or fusions of ancestral chromosomes carrying these genes 
occurred at different places during the evolution of humans, cattle, and 
mice. 

In an effort to generate a more complete bovine syntenic map of Type 
I comparative anchor loci, seven homologs to genes found on HSA5 
were mapped by Zhang and Womack (1992). Five HSA5 genes (CSF2,
RPS14, PDGFRB, FGFA, and CSF1R) were assigned to bovine syntenic
group U22 (chromosome 7), while two others (C9 and HMGCR)
mapped to U10 and U5, respectively. Previous studies had assigned
the HSA5 marker SPARC to bovine syntenic group U22. The mapping 
of genes spanning the length of HSA5 in cattle and also in mouse 
permits syntenic comparisons between prototypic genomes of three 
mammalian orders, providing insight into the evolutionary history of 
this region of the ancestral mammalian genome. 

The studies just discussed demonstrated a high level of conservation 
between human chromosomes and bovine syntenic groups. One such 
comparison was between human chromosome 12 and bovine chromo¬ 
some 5, where at least 16 loci have been shown to be conserved in a 
single homologous segment. However, the degree of conservation of 
order of the loci on bovine chromosome 5 was unknown and, in general, 
the conservation of order in comparisons between humans and cattle 
can only be speculated. We began studying the conservation of gene 
order by estimating the recombination fractions between five of the loci 
that were previously published as mapping to bovine chromosome 5 by 
a combination of in situ hybridization and analysis of bovine-rodent 
somatic cell hybrid lines to determine whether order has been conserved
in the homologous segment of bovine chromosome 5 and human
chromosome 12 (Barendse et al., 1992). Recombination fractions were
estimated in reference pedigrees of cattle for the loci A2M, GSNL,
HOX3, INT1, KRAS2, and PAH. RFLPs for all loci were defined by
screening a panel of eight restriction endonucleases. A multipoint
analysis was performed to estimate support for the most likely order.
This map shows an inverted order of some of the loci. Moreover, the 
distance spanned in cattle is less than a quarter the distance spanned 
in humans. Together, these data indicate that several chromosomal 
evolutionary events have occurred in the homologous segment shared 
by humans and cattle. This study emphasized the demand for extending
comparative gene mapping to a study of comparative order of genes
within conserved segments of synteny. 
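
Recombination fractions are not additive along a chromosome, so they are usually converted to map distances before spans such as those compared above are summed. A minimal sketch follows, assuming the standard Haldane and Kosambi map functions; the text does not state which function, if any, was applied in the cited study, and the counts used here are invented for illustration.

```python
import math


def recombination_fraction(recombinants: int, informative_meioses: int) -> float:
    """Simple estimate of r as recombinant gametes over informative meioses."""
    if informative_meioses <= 0 or not 0 <= recombinants <= informative_meioses:
        raise ValueError("counts must satisfy 0 <= recombinants <= informative_meioses > 0")
    return recombinants / informative_meioses


def haldane_cM(r: float) -> float:
    """Haldane map distance in centimorgans (assumes no crossover interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cM(r: float) -> float:
    """Kosambi map distance in centimorgans (allows for some interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical counts: 10 recombinants among 100 informative meioses.
r = recombination_fraction(10, 100)
print(r, round(haldane_cM(r), 2), round(kosambi_cM(r), 2))
```

For small r both functions give distances close to 100 × r, so the choice matters mainly for widely spaced loci.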

These studies relied almost exclusively on hybridization to molecu¬ 
lar probes. The advent of the PCR, however, opened new doors for 
rapid expansion of the genome maps of domestic animals. The PCR 
was combined with hybrid somatic cell technology to extend the bovine 
physical map by Dietz et al. (1992). Eight bovine loci, glycoprotein
hormone alpha (CGA), coagulation factor X (F10), chromogranin A
(CHGA), low-density lipoprotein receptor (LDLR), human prochymosin
pseudogene (CYM), oxytocin (OXT), arginine-vasopressin
(ARVP), and cytochrome oxidase c subunit IV pseudogene (COXP),
were assigned to bovine syntenic groups with this approach. CGA was
assigned to bovine syntenic group U2, F10 to U27, CHGA to U4 (bovine
chromosome 21), LDLR to U22, CYM to U6, OXT and ARVP to U11,
and COXP to U3 (bovine chromosome 5). Seven of these genes, CGA,
F10, CHGA, LDLR, OXT, ARVP, and CYM, further delineated regions
of chromosomal conservation on human chromosomes 6, 13, 14, 19, 20,
20, and 1, respectively. CHGA, OXT, and ARVP were yet unmapped in
the mouse. Comparative mapping predicted that mouse CHGA would
map to chromosome 12, and mouse OXT and ARVP would map to
mouse chromosome 2. Furthermore, human CYM was predicted to be
sublocalized to 1p32-q21. The primers developed for these eight loci
were the beginning of a collection of bovine expressed sequence tags for 
characterizing the hybrid cell panel. 

Neibergs et al. (1993) used the PCR primers designed to amplify 
bovine specific sequences of the ARVP, CGA, COXP, CYM, F10, inhibin 
A (INHBA), LDLR, and OXT genes in hybrid cells in a search for 
single-strand conformation polymorphisms in cattle populations. DNA
from 75 animals comprising cross-bred and 7 purebred breeds was
analyzed. ARVP, COXP, CYM, LDLR, and OXT were found to be polymorphic
while CGA, F10, and INHBA were not. Polymorphic regions
were identified within 206 bp of exon 1 of ARVP, 582 bp of the pseudogene
COXP, 253 bp of exon 9 of CYM, 519 bp of LDLR cDNA, and 160
bp of the upstream regulatory region of OXT. This was the first report
of bovine polymorphisms for these genes and an important step in
incorporating Type I comparative anchor loci into the bovine linkage
map. Polymorphic loci were subsequently analyzed in pedigreed full-sib
families and shown to be inherited in a Mendelian fashion.

Despite the success with RFLPs and single-strand conformation 
polymorphisms (SSCPs), it was apparent by 1993 that microsatellites 
would become the marker of choice for linkage mapping in domestic 
animals. Several studies were initiated to isolate and map bovine microsatellites,
including one by Steffen et al. (1993). A partial plasmid
library with bovine genomic inserts of about 500 bp was screened with
a (dC-dA)n·(dG-dT)n oligonucleotide probe for the repeated nucleotide
motif (CA)n. Eleven positive clones (0.3% of all colonies screened) were
discovered and were subsequently isolated and sequenced. Eight microsatellite
loci were analyzed, one with eight alleles, one with seven
alleles, three with six alleles, one with three alleles, and two with two 
alleles. Six of these microsatellites were mapped by PCR analysis of a 
panel of somatic hybrid lines. This initial synteny mapping was later 
followed by linkage mapping when the appropriate reference families 
were available and the best available microsatellites had been se¬ 
lected. 

Some markers have both Type I and Type II characteristics. Such is
the bovine gene for the p21 ras protein activator (RASA), which includes
in its 5' untranslated region a (TG)n repeat. Analysis of this (TG)n
repeat by PCR amplification of genomic DNA revealed a four-allele
polymorphism. A cDNA probe was used to assign RASA to the region
2.4-qter of bovine chromosome 7 by in situ hybridization. PCR analysis
of our panel of somatic hybrid lines allowed the assignment of RASA to
the unassigned syntenic group 22 (U22) and thus localized U22 on
chromosome 7 (Eggen et al., 1992).

With the advent of FISH, the bovine synteny groups were quickly 
assigned to chromosomes. A 260-bp genomic PstI fragment, which encodes
a portion of the carbohydrate recognition domain, for example,
was used along with hybrid somatic cells to map the conglutinin gene
(CGN1) to domestic cow (Bos taurus) syntenic group U29 by Gallagher
et al. (1993). In turn, a cosmid containing the entire bovine CGN1 was
used with FISH to sublocalize this gene to cattle (BTA) 28 band 18.
Because BTA 28 and several of the other small acrocentric autosomes
of cattle are difficult to discriminate, they also chromosomally sublocalized
CGN1 to the p arm of the lone biarmed autosome of the gaur
(Bos gaurus). The use of the gaur 2/28 Robertsonian as a marker chromosome
and the assignment of CGN1 to BTA 28 helped resolve some of
the early nomenclatural questions involving this cattle chromosome. 

The bovine linkage map developed quickly after the discovery of 
microsatellites but only after the assembly of reference families for
linkage analysis. Barendse et al. (1993a) reported the first genetic map
of highly polymorphic index DNA loci in livestock for bovine chromosome
21. This map consisted of six loci with an average heterozygosity
of 82%, each with a minimum of five alleles, spaced at an average 
genetic distance of 9.7 cM, and covering most of the expected length of 
the acrocentric bovine chromosome 21. The order of markers along the 
chromosome was determined to be cen-ETH 131-UWCA 4-TGLA 337- 
TGLA 122-CSSM 18-GMBT 16-tel. There was heterogeneity reported 
among the recombination fractions between the sexes. 
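
The summary statistics quoted for this map are simple to reproduce. The sketch below assumes that heterozygosity means expected heterozygosity, H = 1 - sum(p_i^2), and that the 9.7 cM figure is the mean interval between adjacent loci; neither assumption is confirmed by the text, and the allele frequencies are invented.

```python
from typing import Sequence


def expected_heterozygosity(allele_freqs: Sequence[float]) -> float:
    """Expected heterozygosity, 1 - sum(p_i^2), for one locus."""
    if any(p < 0 for p in allele_freqs) or abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must be non-negative and sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def map_span_cM(n_loci: int, mean_interval_cM: float) -> float:
    """Length covered by n ordered loci separated by a given mean interval."""
    if n_loci < 2:
        raise ValueError("at least two loci are needed to define an interval")
    return (n_loci - 1) * mean_interval_cM


# Hypothetical locus with five equally frequent alleles.
print(round(expected_heterozygosity([0.2] * 5), 3))
# Six loci at a mean spacing of 9.7 cM (the figures quoted in the text).
print(round(map_span_cM(6, 9.7), 1))
```

Under these assumptions a locus needs at least five roughly equifrequent alleles to reach a heterozygosity of 0.8, and six loci at the quoted spacing would cover about 48.5 cM.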

To identify physical and genetic anchor loci on bovine chromosomes, 
13 cosmids, obtained after the screening of partial bovine cosmid libraries
with the (CA)n microsatellite motif, were mapped by FISH by
Solinas-Toldo et al. (1993). Eleven cosmid probes yielded a specific 
signal on one of the bovine chromosomes and identified the following 
loci: D5S2, D5S3, D6S3, D8S1, D11S5, D13S1, D16S5, D17S2, D19S2, 
D19S3, D21S8. The microsatellite-containing regions were subcloned 
and sequenced. Primers were designed for eight of the nonsatellite 
cosmids, and seven of these microsatellites were polymorphic with be¬ 
tween three and eight alleles on a set of outbred reference families. The 
polymorphic and chromosomally mapped loci were subsequently used 
to physically anchor other bovine polymorphic markers by linkage 
analysis. The microsatellite primers were also applied to DNA samples 
of the now fully characterized panel of somatic hybrid cell lines, allow¬ 
ing the assignment of seven microsatellite loci to defined syntenic 
groups. These assignments confirmed earlier mapping results and 
placed two formerly unassigned syntenic groups on specific chromo¬ 
somes. 

Approximately 400 loci have been mapped in both cattle and humans.
Most of these have also been mapped in mice. While extensive
conservation of synteny has been observed between cattle and humans
(Womack and Moll, 1986; Threadgill et al., 1991), conservation of linkage
(conservation of gene order) may not be so prevalent. Barendse et
al. (1994) incorporated a sufficient number of Type I loci into the linkage
map to demonstrate the presence of several rearrangements of
gene order within conserved syntenies.

The sheep and cattle maps, as expected, are highly conserved with 
only three apparent differences (Echard et al., 1994). Developing these 
maps in parallel should enhance and economize the efforts targeted to 
each map. 

To make the process of comparative gene mapping in mammals more
efficient, O’Brien et al. (1993) proposed a list of comparative mapping
anchor loci (CMAL). These 321 Type I markers were selected to span 
the human genome at approximately 10 cM intervals and to include 
the anchor loci for human and mouse maps. Concerted efforts to map 
domestic animal homologs of these loci by syntenic, cytogenetic, and 
linkage approaches will greatly facilitate comparative mapping and 
the use of the growing human and mouse data base for animal im¬ 
provement. 

A breakthrough in comparative gene mapping is heterologous chro¬ 
mosomal painting, or ZOO-FISH painting. Rettenberger et al. (1995) 
painted the pig genome with 22 human autosome-specific libraries. 
Solinas-Toldo et al. (1995) have done the same experiments including 
the X chromosome in cattle. These studies delineate the boundaries of 
chromosomal conservation at the cytogenetic level and are thus far 
highly consistent with the results of comparative synteny mapping. 
Like synteny mapping, they do not address conservation of gene 
order. 


D. Economic Traits 

A number of traits of economic significance are beginning to appear 
on animal genome maps. In addition to bovine LAD (Shuster et al., 
1992; Threadgill et al., 1991), UMPS is also mapped to a specific site on 
chromosome 1 (Schwenger et al., 1993; Ryan et al., 1994; Barendse et 
al., 1993). BoLA is associated with susceptibility to leukemia virus 
infection (Lewin et al., 1988). Georges et al. (1993a) have linked the 
polled locus to microsatellites on chromosome 1. The weaver disease 
maps to markers on chromosome 4 (Georges et al., 1993b) and also 
appears to be associated with a quantitative trait for improved milk 
production. Variation around the prolactin gene on chromosome 23
(Cowan et al., 1990) is related to milk production in certain Holstein
sire families, and Georges et al. (1995) have used mapped microsatellites
to locate an additional five QTL for milk production. The
Booroola fecundity gene (Montgomery et al., 1993) and a hypermuscling
gene called callipyge (Cockett et al., 1994) have been mapped
using the genomic approach to ETL mapping in sheep. Growth and 
fatness QTL were mapped in pigs in a cross involving the European 
wild boar (Andersson et al., 1994) and previously the candidate gene 
approach identified the genetic defect for porcine malignant hyperther¬ 
mia (Otsu et al., 1991). The number of ETL on animal genome maps is 
rapidly expanding. 



V. Future Directions 

Sufficient numbers of mapped markers now exist in cattle, sheep, 
pigs, and chickens for extensive genome coverage of families segregat¬ 
ing ETL. Unfortunately, different ETL require different segregating 
families. These resource families are usually expensive to develop and 
maintain. Nonetheless, the families are an integral and necessary step 
in the ultimate application of the gene map to economic improvement. 
The mapping of ETL to 10-20 cM regions of a chromosome may be 
followed by high resolution mapping to ultimately identify and clone the 
responsible genes. Chromosome-specific libraries will aid this process. 

Despite the difficulties and expense of obtaining adequate resource 
family material, ETL will continue to appear on cattle linkage maps 
over the next few years. Animal breeders will not be satisfied with 
ETL-marker distance of 5-10 cM, however, and the next major step, 
identifying and cloning ETL, is a formidable one. The high-resolution 
linkage maps, chromosomal deletions, and large insert contigs that 
have contributed greatly to positional cloning of human disease loci are 
simply not available for animal ETL cloning. Positional cloning of hu¬ 
man genes is rapidly shifting toward the positional candidate (Collins, 
1995) approach, however, which relies on the availability of a pool of 
expressed genes mapped to the same chromosomal regions as the dis¬ 
ease gene. In livestock as in humans, the three-step process for posi¬ 
tional candidate cloning will be (1) to localize the ETL to a chromosom¬ 
al subregion, (2) to search data bases for reasonable candidate genes, 
and (3) to test the candidate gene for variation correlated with phe¬ 
notype. Obviously, step 2 is unrealistic in cattle where currently only 
400 of the 70,000 or so genes are even assigned to chromosomes. This 
step was almost as unrealistic in human genetics until very recently 
when several international initiatives targeted the large-scale mapping
of expressed sequence tags (ESTs). The success of these efforts
suggests that more than half of the human transcripts could be mapped
by the time this article is published (Collins, 1995). Thus, the key to animal
ETL cloning may be through comparative genome data bases, which 
translate a 10-cM bovine segment into its human counterpart, then 
search for human ESTs with attractive features relative to the bovine 
phenotype. Assuming 20 or so potential candidate genes per centimorgan,
the work will then have only begun. Nonetheless, there is
hope for ETL cloning through such a comparative positional candidate 
cloning strategy that is not available outside comparative mapping 
and extensive mapping of human transcripts. 
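
The three-step comparative positional candidate strategy described above can be sketched as a pair of table lookups. Everything in this example is hypothetical: the comparative-map rows, region labels, EST names, and class names are placeholders invented to show the data flow, and only the figures of 20 genes per centimorgan, 400 mapped loci, and 70,000 genes come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConservedSegment:
    """One row of a hypothetical cattle-human comparative map."""
    bovine_chr: str
    start_cM: float
    end_cM: float
    human_region: str


@dataclass(frozen=True)
class HumanEST:
    """A mapped human transcript (hypothetical record)."""
    name: str
    human_region: str


def human_counterparts(chrom: str, start_cM: float, end_cM: float,
                       segments: List[ConservedSegment]) -> List[str]:
    """Translate a bovine ETL interval into the human regions it overlaps."""
    return [s.human_region for s in segments
            if s.bovine_chr == chrom and s.start_cM < end_cM and s.end_cM > start_cM]


def candidate_ests(regions: List[str], ests: List[HumanEST]) -> List[HumanEST]:
    """Collect human ESTs mapped to any of the given regions."""
    wanted = set(regions)
    return [e for e in ests if e.human_region in wanted]


def expected_candidates(interval_cM: float, genes_per_cM: float = 20.0) -> float:
    """Rough number of genes to screen in an interval of the given size."""
    return interval_cM * genes_per_cM


# Hypothetical comparative map and EST table.
segments = [
    ConservedSegment("BTA_x", 0.0, 40.0, "HSA_region_1"),
    ConservedSegment("BTA_x", 40.0, 90.0, "HSA_region_2"),
]
ests = [
    HumanEST("EST_a", "HSA_region_1"),
    HumanEST("EST_b", "HSA_region_2"),
    HumanEST("EST_c", "HSA_region_3"),
]

regions = human_counterparts("BTA_x", 45.0, 55.0, segments)   # a 10-cM ETL interval
print(regions, [e.name for e in candidate_ests(regions, ests)])
print(expected_candidates(10.0))          # candidates implied by 20 genes per cM
print(round(400 / 70000 * 100, 2))        # percent of bovine genes assigned so far
```

The lookup only works if the comparative map resolves rearrangements inside conserved segments; a segment boundary drawn too coarsely would return the wrong human region.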

Exploitation of human EST data bases for recovery of animal genes
will require much more precise comparative maps than are currently
available. Identifying the boundaries of conserved synteny will not be
sufficient. We must identify breakpoints of internal rearrangements as
well as the reciprocal rearrangements that have accompanied mammalian
chromosomal evolution.

I. Introduction

Selected breeding to exploit physical attributes and behavioral skills 
has made the modern dog the most diverse species on earth. All dogs 
are members of the same species, and dogs of any breed can be crossed 
to produce fertile offspring. Yet, among the over 320 different breeds 
recognized by various kennel clubs, there exists a range of morphologi¬ 
cal variation that is unsurpassed among other species of animals. In 
contrast, because of the institution of breed standards, dogs within a 
breed are relatively homogeneous. From a genetic viewpoint, this combination
of uniformity and diversity is a tremendous asset. The uniformity
ensures that nearly all dogs of a given breed will share significant
homogeneity at many loci in addition to those that are important to
the breed standard. The diversity, however, ensures a wealth of opportunities
to genetically map and characterize genes responsible for
differences.

One unfortunate consequence of inbreeding is the maintenance of 
genetic diseases within many breeds. Autosomal recessive traits pre¬ 
sent the greatest problems. Asymptomatic carrier dogs produce appar¬ 
ently healthy puppies until they are crossed with another dog carrying 
the same disease-associated allele. Thus, carrier status may not be 
evident until several litters have been produced. Alternatively, unde¬ 
sirable traits may be maintained within a breed because symptoms do 
not develop until after reproductive age. Some diseases that fall into 
this category include various heart defects, epilepsy, and susceptibility 
to autoimmune disorders. 
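
The reason carrier status can stay hidden for several litters follows directly from Mendelian ratios. The sketch below assumes a single fully penetrant autosomal recessive allele, written "a" against the normal allele "A"; the litter size and function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from typing import Dict


def offspring_genotypes(parent1: str, parent2: str) -> Dict[str, Fraction]:
    """Genotype probabilities for one pup, given two-letter parental genotypes."""
    counts: Dict[str, int] = {}
    for a, b in product(parent1, parent2):
        genotype = "".join(sorted(a + b))  # 'aA' and 'Aa' are the same genotype
        counts[genotype] = counts.get(genotype, 0) + 1
    total = sum(counts.values())
    return {g: Fraction(c, total) for g, c in counts.items()}


def prob_no_affected(n_pups: int, parent1: str = "Aa", parent2: str = "Aa") -> Fraction:
    """Chance that none of n pups is homozygous for the recessive allele."""
    p_affected = offspring_genotypes(parent1, parent2).get("aa", Fraction(0))
    return (1 - p_affected) ** n_pups


print(offspring_genotypes("Aa", "Aa"))   # carrier x carrier
print(offspring_genotypes("Aa", "AA"))   # carrier x non-carrier: no affected pups
print(float(prob_no_affected(6)))        # hypothetical litter of six from two carriers
```

Under these assumptions, a carrier mated to non-carriers never produces an affected pup, and even a carrier-by-carrier litter of six has roughly an 18% chance of showing no affected animals.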

Two other factors that complicate genetic mapping are the occur¬ 
rence of diseases that have variable penetrance and diseases caused by 
multiple genes. In the case of variable penetrance, affected dogs may 
display the trait at differing levels, with some dogs even appearing 
unaffected. As a result, apparently healthy dogs may be naively 
crossed, producing dogs that are later recognized as diseased. Traits 
prone to variable penetrance are often affected by environmental fac¬ 
tors such as nutrition and infectious diseases. Complex traits, by com¬ 
parison, are those that are controlled by several genes, each of which 
contributes partially to a final phenotype. The genetics of complex 
traits is sophisticated and requires pedigrees with large numbers of 
offspring and well-documented disease status. Even among the most 
diligent breeders, this information is often incomplete. 

Mapping both simple and complex diseases in dogs will become pos¬ 
sible through the development of a canine genetic map. A genetic map 
is composed of polymorphic genetic markers. When analyzed together 
with the disease status of members of a family, the genetic map func¬ 
tions as a series of molecular signposts that highlight regions of the 
genome likely to contain genes of interest. 

In the following sections, we briefly review our motivation to con¬ 
struct a canine genetic map, the organization of the canine genome, 
and the development and use of genetic markers in constructing the 
map. Finally, we conclude with a brief discussion of genetic registries 
and the use of genetic markers as diagnostic tools. 



THE CANINE GENOME 



II. Applications of a Canine Genetic Map 

Purebred dogs are plagued with many inherited diseases. One major 
application of a canine genetic map will be to facilitate study of such 
diseases. This article is not an attempt to detail all canine diseases; 
other excellent resources exist for this purpose. These resources in¬ 
clude (1) Online Mendelian Inheritance in Animals (OMIA), (2) the 
Canine Genetic Disease Information System, and (3) Genetics of the 
Dog by Malcolm Willis (Willis, 1989). OMIA is a catalog of inherited 
disorders listing clinical and diagnostic features, providing summaries 
of current knowledge, and presenting a discussion of the presumed 
mode of inheritance for each disorder. It was developed by several 
investigators including Steve Brown and Paul LeTissier. The Canine 
Genetic Disease Information System, developed by Dr. Donald Patter¬ 
son and colleagues in the section of Medical Genetics at the Veterinary 
School of the University of Pennsylvania provides descriptive and clin¬ 
ical information about a wealth of diseases. Genetics of the Dog lists 
inherited disorders of the dog by system and includes comments on 
breed susceptibility. Other excellent resources exist, or are in prepara¬ 
tion, but the three listed here are sources of detailed and comprehen¬ 
sive information for anyone interested in canine inherited diseases. To 
date, the most extensively studied diseases are either homologs of 
human conditions or diseases that are widespread and threatening 
within particular breeds. Following are a few relevant examples. 

A. Canine Models for Human Genetic Diseases 

Many canine pedigrees with inherited diseases are studied because 
they are expected to provide useful data about similar diseases in 
humans, such as inherited neurological disorders. Among the most 
well studied is hereditary canine spinal muscular atrophy (HCSMA), 
which is similar to human spinal muscular atrophy, an autosomal 
recessive degenerative motor neuron disease. HCSMA was originally 
identified in Brittany Spaniels (Cork et al., 1979). The disease produces
progressive weakness and atrophy of muscle groups because of
lower motor neuron loss. The course of illness depends on the particu¬ 
lar variant of the disease, but the illness is often fatal, usually as a 
result of the involvement of the muscles of respiration and swallow¬ 
ing. In dogs, the disease appears to be autosomal dominant, with 
variable penetrance. Dogs that are presumed homozygotes for the disease
locus have a profound phenotype, characterized by accelerated
disease, with weakness beginning between 6 and 8 weeks of age and
progressing rapidly to complete tetraparesis by 13-16 weeks (Sack et
al., 1984).

Extensive pathologic studies on HCSMA showed that the disease 
recapitulates many aspects of human motor neuron diseases. Human 
spinal muscular atrophy (SMA) is complex, because there appear to be 
three forms with different ages of onset, each of which maps to a well-characterized
region of chromosome 5. The causative gene (or genes) is
likely one of several located in a region of generalized instability on the 
chromosome. The unstable nature of this region has made the genetics 
of human SMA difficult to study. Hence, studies have been initiated on 
the canine disorder. 

One additional example is provided by the work of Mignot and col¬ 
leagues, who have studied the genetics of canine narcolepsy, an autoso¬ 
mal recessive trait with full penetrance in Doberman pinschers and 
Labrador retrievers (Baker et al., 1982; Mignot et al., 1991). Unlike 
human narcolepsy, which appears to be polygenic, recent work from 
this group suggests that a single autosomal recessive gene, canarc-1, is 
responsible for the disorder. Canarc-1 is an immunoglobulin mu-switch-like
gene. The identification of a non-Major Histocompatibility
Complex (MHC) gene for this disorder is noteworthy because human 
narcolepsy has been linked thus far only to MHC gene class II alleles 
(Marcadet et al., 1985). Because the disorder in humans involves mul¬ 
tiple genes, the findings from Mignot and colleagues may suggest addi¬ 
tional strategies for researchers studying human narcolepsy. 

An understanding of the genetic basis of some diseases has raced
ahead of that of other disorders because of their simple patterns
of inheritance. A well-studied example is an X-linked severe combined
immunodeficiency syndrome called X-SCID. In dogs, this disease is 
similar to one form of human SCID and is characterized by profound 
defects in cellular and humoral immunity, resulting in a normal num¬ 
ber of circulating B cells and IgM, but low levels of serum IgG and IgA 
(Jezyk et al., 1989). In humans, the disease is associated with muta¬ 
tions in the gene for the gamma chain of the interleukin-2 receptor 
(Noguchi et al., 1993). Henthorn and colleagues have shown that muta¬ 
tions in the same gene are responsible for the canine disease in a 
colony of dogs established from a single X-linked SCID carrier female 
(Henthorn et al., 1994). More recently, a different mutation in the 
same gene has been found in a Welsh Corgi puppy presenting with 
classic symptoms of X-SCID, suggesting that different mutations in 
the same gene may account for the same disease in different dogs or 
different breeds (Somberg et al., 1995). Maintenance of this colony and 
cloning of the relevant genes offered a unique opportunity to character¬ 
ize this devastating disorder. 


B. Diseases of Interest to Specific Breeds 


The predisposition of various dog breeds to specific inherited disor¬ 
ders is well recorded. Some of the inherited diseases described in the 
literature are thought to be breed specific. For instance, collie eye 
anomaly (CEA) is believed to affect the rough and smooth collie, the 
Shetland sheepdog, and, rarely, the border collie. The disease has vari¬ 
able expression and confusion remains as to its true mode of inheri¬ 
tance. Affected dogs share a series of ocular defects that typically in¬ 
clude tortuous retinal arteries and veins, a pale area or specifically 
placed streaks on the fundus of the eye, and coloboma in the area near 
the optic nerve head (Roberts et al., 1966). Visual impairment ranges from 
slight to total blindness. The incidence of the different forms varies, 
but the disease is clearly problematic for the threatened breeds, and 
genetic studies are needed to facilitate testing. 

Other disorders, because of their high frequency, are nearly syn¬ 
onymous with the breed in which they were first reported. An example 
is Scottie or Scotch cramp. Although most common in Scottish terriers, 
it has also been reported in wire-haired fox terriers, cocker spaniels, 
and dalmatians (Meyers et al., 1970; Smythe, 1945; Woods, 1977). The 
disease appears to be autosomal and recessive; affected dogs undergo 
periods of muscular hypertonicity, which are relieved after resting. The 
dogs appear neither to be in pain nor to lose consciousness. Similar diseases 
are reported in other breeds. Although the disease is not generally 
life threatening, its unique characteristics and clear genetics have 
made it a focus of significant study. 

Perhaps the most devastating diseases are those for which the dog 
appears normal at birth and early life, and then displays a syndrome 
during mid-life. Epidemiological data exist, for instance, to suggest 
that particular forms of cancer are much more common in some breeds 
than others. Historically, the boxer breed is associated with a variety 
of types of tumors, including thyroid, cutaneous, nervous system, and 
mammary tumors (MacVean et al., 1978). Anecdotal evidence suggests 
other breeds also have genetic predisposition to particular tumors in¬ 
cluding melanomas in Scottish terriers (Cotchin, 1954; Cotchin, 1955), 
mammary tumors in cocker spaniels, and mastocytomas in Boston 
terriers. 

The genetic basis of these traits has been difficult to study because 
there are probably multiple genes involved. In general, the strategies 
necessary to study multi-gene traits are much more complex than 
those used to map the genes responsible for single-gene disorders. 
Mapping complex traits typically requires families that display the 
trait at its extreme levels, i.e. profoundly affected dogs as well as mild¬ 
ly affected dogs. Rigorous quantitative data regarding strength of phe¬ 
notype must be available for multiple generations of the family. Final¬ 
ly, a certain amount of luck is required. If a particular trait is 
controlled by a large number of genes, each of which contributes only 
minimally to the final complex phenotype, it will be difficult to dissect 
out the contributions of any one gene. If, however, the phenotype is 
controlled largely by a small number of genes, i.e. two or three, the 
genetics will be much easier to resolve. The highest level of resolution 
is achieved when a high resolution map is used to analyze rigorous 
quantitative data from several generations of a family. 

Complex diseases such as cancer, epilepsy, and hip dysplasia are 
most productively studied using a genetic map. A genetic map allows 
an investigator to navigate around the genome and, when properly 
developed and implemented, is a key resource for identifying regions of 
the genome involved in disease. The development of a canine map will 
enormously influence canine genetic research in the future. Because of 
the central role it will play, the organization of the canine genome and 
the construction of a canine genetic map are discussed in detail in the 
following section. 


III. Organization of the Canine Genome 


A. Chromosomes 

The dog has 38 pairs of autosomes, most of which are small, acro¬ 
centric chromosomes, plus an X and Y. Historically, however, it has 
been difficult to take advantage of the organization of the dog genome 
into so many discrete linkage groups because cytogenetic analysis of 
dog chromosomes is extremely difficult. A number of reports on band¬ 
ing of dog chromosomes were published in the mid-1970s, but they 
failed to agree on the banding patterns of many of the chromosomes, 
and numbering assignments were not consistent. As cytogenetic tech¬ 
nologies have improved, interpretation of Giemsa-banded dog chromo¬ 
somes has become much easier. Stone et al. (1991) have used cell syn¬ 
chronization techniques to define higher resolution banding patterns 
at the 327 band level for canine chromosomes. The only breed difference 
reported among the dogs examined was the degree of chromosome 
condensation. A higher resolution karyotype was reported by 
Graphodatsky et al. (1995), who used GTG banding to define dog and silver 
fox chromosomes to the 400 band level. 

Langford et al. (1995) have used molecular genetic techniques to 
develop a set of chromosome “paints” specific for the canine genome, in 
collaboration with investigators from the Animal Health Trust in New¬ 
market and the Sanger Centre in Cambridge. To generate these paints, 
investigators used flow cytometry to separate the 38 canine chromo¬ 
somes into 32 reproducible and recognizable peaks, with most of the 
peaks representing one chromosome and a small number representing 
two. The deoxyribonucleic acid (DNA) from each peak was amplified 
and fluorescently labeled using polymerase chain reaction (PCR) and 
then hybridized to metaphase spreads of dog chromosomes. The la¬ 
beled products anneal specifically to the particular chromosome from 
which they were derived (hence the name paints) and are becoming an 
important resource with which to reproducibly distinguish one chro¬ 
mosome from another. As a result, chromosome numbers can now un¬ 
ambiguously be assigned to the 21 largest chromosomes (Langford et 
al., 1995), and it is expected that further development of additional 
reagents will allow all of the chromosomes to be definitively numbered. 

B. Repeated Elements in the Canine Genome 

The canine genome, like other mammalian genomes, is characterized 
by several classes of repetitive elements, one of which is the family of 
minisatellites detected by the Jeffreys probes. These probes are DNA sequences isolated 
from the human genome that have been shown to cross-hybridize to 
dog DNA to give individual specific fingerprints in dogs (Jeffreys and 
Morton, 1987, and references therein). The probes consist of a reiter¬ 
ated short core sequence that hybridizes to many families of tandemly 
repeated DNA sequences that show multiallelic length variations. The 
typical assay hybridizes radiolabeled core elements to DNA cleaved 
with restriction enzymes to produce a characteristic Southern blot fin¬ 
gerprint. With the exception of monozygotic twins, DNA fingerprints 
are absolutely individual-specific. These probes are useful to study 
migration of small isolated populations (Gilbert et al., 1990) and to 
resolve cases of questionable parentage. In one clear illustration of 
this phenomenon, Jeffreys and Morton (1987) demonstrated that DNA 
fingerprints of any two purebred whippets were no more similar than 
fingerprints derived from dogs of completely unrelated breeds. 


IV. Development of a Canine Genetic Map 


A. Overview and Approach to Map Construction 

A genetic map consists of a series of genes or genetic markers for 
which the specific order and distances between them are known. A 
framework map is composed entirely of markers for which the order 
and spacing is very well supported and, although it may be of low 
density, is very reliable. This is in contrast to a comprehensive map 
that aims to include all syntenic markers. Although comprehensive 
maps are usually very dense, they may be locally unreliable with re¬ 
spect to both order and spacing. 

Maps are constructed by examining the inheritance of polymorphic 
alleles in the offspring of a specific mating. If two markers are on 
different chromosomes, their alleles will be inherited independently; if 
they are on the same chromosome, they may segregate nonrandomly 
and are said to be linked. Genes that are close together are usually 
inherited together, while genes that are far apart on the same chromo¬ 
some segregate independently because genetic recombination between 
chromosomes at meiosis disrupts their association. For a given area of 
the genome, the probability of a genetic recombination event occurring 
between a pair of markers is proportional to the distance between 
them. The probability of a genetic recombination event occurring is 
expressed as a recombination fraction or, after minor mathematical 
adjustment for the possibility of double recombinants, as centiMorgans 
(cM). One percent recombination is equal to approximately 1 cM, which 
roughly corresponds to a million base pairs in humans and just over 
two million in the mouse. 

To be useful, a genetic map should have markers placed at least 
every 10 cM. If the size of the dog genome is assumed to be similar to 
that of humans, such a map will consist of about 350 well-spaced mark¬ 
ers. The more densely the map is covered, however, the more useful it 
will be. The human genetic map, for instance, currently has over 6000 
markers, and is still considered incomplete (Murray et al., 1994; 
Weissenbach et al., 1992). Increasing map density improves the precision 
with which a disease region can be localized. A 10 cM map, for 
instance, will allow localization of a disease gene to within only 5 
million base pairs, whereas a map of 1 cM density, composed of 3000 
markers, will localize a gene to within half a million base pairs. 
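
The arithmetic behind these figures can be sketched in a few lines of Python. This is a rough illustration only: the genetic length of 3500 cM is an assumption back-calculated from the estimate of about 350 markers at 10 cM spacing, the conversion of 1 cM to one million base pairs is the human approximation quoted above, and the function names are hypothetical. The localization window is taken as half the marker spacing, which reproduces the 5 million and half a million base pair figures in the text.

```python
import math

# Human approximation quoted in the text; applying it to the dog is an assumption.
BP_PER_CM = 1_000_000


def markers_needed(genome_length_cm, spacing_cm):
    """Number of evenly spaced markers needed to cover a genome at a given spacing."""
    return math.ceil(genome_length_cm / spacing_cm)


def localization_window_bp(spacing_cm, bp_per_cm=BP_PER_CM):
    """A gene lies at most half a marker interval away from its nearest marker."""
    return spacing_cm / 2 * bp_per_cm


# Hypothetical genetic length, chosen to match ~350 markers at 10 cM spacing.
genome_cm = 3500
for spacing in (10, 1):
    print(f"{spacing} cM spacing: {markers_needed(genome_cm, spacing)} markers, "
          f"{localization_window_bp(spacing):,.0f} bp window")
```

With the assumed 3500 cM genome, a 1 cM map requires 3500 markers, which is of the same order as the figure of 3000 given in the text.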

The construction of a genetic map takes place in several sequential 
steps. First, large numbers of polymorphic genetic markers are iso¬ 
lated from the genome of interest. Second, the markers are placed into 
linkage groups and, finally, the order of markers in a given linkage 
group is determined. Each of these steps is discussed in the next 
section. 


B. Microsatellite Repeats 

A genetic marker, by definition, must have two or more alleles; if the 
frequency of the most common allele is less than 95% the marker is 
described as polymorphic. One measure of the degree of polymorphism 
displayed by a given marker is the polymorphism information content 
(PIC) value (Botstein et al., 1980). The PIC value defines the proba¬ 
bility that for a particular marker, the genotype of a given offspring 
will allow deduction of which of the parent’s two marker alleles it 
received. The PIC has a value of 0.0-1.0 and uses information about the 
number of alleles for a given marker in the population and the frequen¬ 
cy with which each allele appears. The ideal map is made up of large 
numbers of highly informative markers, typically with PIC values 
greater than 0.7. 
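
As a minimal sketch of how a PIC value is obtained from allele frequencies, the following Python function implements the formula of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The function name and the example frequencies are illustrative assumptions, not data from any canine marker.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content (Botstein et al., 1980) from allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that a random individual is homozygous.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings between identical heterozygotes, which are only half informative.
    correction = sum(2 * (p * q) ** 2 for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - correction


# Illustrative frequencies only: a two-allele marker and a ten-allele marker.
print(pic([0.5, 0.5]))
print(pic([0.1] * 10))
```

A marker with two equally frequent alleles cannot exceed a PIC of 0.375, which is why markers with many alleles, such as microsatellites, are needed to reach the values above 0.7 sought for map building.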

The most useful type of genetic marker is the microsatellite. Microsatellites 
are small stretches of DNA composed of short mono-, di-, tri-, 
or tetranucleotide motifs repeated in tandem, such as (CA)n, where n 
is typically 5-30 (Hamada and Kakunaga, 1982; Miesfeld et al., 1981; 
Stallings et al., 1991; Tautz and Renz, 1984). Microsatellites are polymorphic 
for repeat number within populations, yet with an estimated 
mutation rate to new alleles of only 5 × 10⁻⁴ to 10⁻⁵ per allele per 
meiosis in the human genome (Kwiatkowski et al., 1992), are sufficiently 
stable to use as genetic markers of Mendelian inheritance (Litt 
and Luty, 1989; Weber and May, 1989). 
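
To make the definition concrete, the sketch below scans a DNA sequence for tandem runs of a given motif. It is an illustration of what a (CA)n repeat looks like in sequence data, not the library-screening method used to isolate markers; the function name, the default threshold of five repeats (the lower end of the range quoted above), and the test sequence are all assumptions made for the example.

```python
import re


def find_microsatellites(sequence, motif="CA", min_repeats=5):
    """Return (start, repeat_count) for each tandem run of `motif` with >= min_repeats copies."""
    # Builds a pattern such as (?:CA){5,} for the default arguments.
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_repeats},}}")
    return [(m.start(), len(m.group()) // len(motif))
            for m in pattern.finditer(sequence.upper())]


# Invented sequence: a (CA)12 array flanked by unique sequence, plus a short (CA)3 run.
example = "GGTT" + "CA" * 12 + "GGTT" + "CA" * 3 + "TT"
print(find_microsatellites(example))
```

Only the 12-unit array is reported; the 3-unit run falls below the threshold. In practice the unique flanking sequence on either side of such an array is what the marker-specific PCR primers are designed against.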

Microsatellites are found in large numbers in all eukaryote genomes 
and, equally importantly for map building, the distribution of these 
simple sequence repeats around the genome is approximately random. 
In both humans and dogs, for instance, a dinucleotide (CA)n repeat is 
found on average every 30-60 kilobases (kb) (Stallings et al., 1991; 
Ostrander et al., 1993). Individual microsatellite markers are dis¬ 
tinguished from one another by PCR primers designed to match the 
unique DNA sequences that bracket each repeat (Fig. 1). The inheri¬ 
tance of individual alleles is easily measured by the PCR, which de¬ 
tects the length of polymorphisms that characterize different alleles 
(Litt and Luty, 1989; Weber and May, 1989). Microsatellite-based 
markers have been used to construct genetic maps of many organisms 
including human (Murray et al., 1994; Weissenbach et al., 1992), 
mouse (Dietrich et al., 1992), and cattle and pigs (Bishop et al., 1994; 
Ellegren et al., 1993; Fries et al., 1993; Toldo et al., 1993). Efforts have 
been underway since 1992 to isolate large numbers of microsatellite- 
based markers for the canine genome (Ostrander et al., 1992; 
Ostrander et al., 1993; Holmes et al., 1993; Deschenes et al., 1994; 
Mellersh et al., 1994; Francisco et al., 1996). 

Fig. 1. Oligonucleotide primers are designed to match the unique DNA sequences that bracket microsatellites. The inheritance of 
individual alleles is monitored by amplifying the DNA by the polymerase chain reaction (PCR), followed by fractionation on 
polyacrylamide gels. 

(CA)n repeat-based microsatellites are typically isolated from short- 
insert genomic libraries screened with radioactively labeled (CA)n or 
(TG)n probes. DNA is isolated from positive clones and sequenced, and 
marker-specific PCR primers are designed to flank each repeat. The 
development of a strategy to construct genomic libraries highly en¬ 
riched for microsatellite repeats has been useful for the isolation of 
large numbers of microsatellite bearing clones (Ostrander et al., 1992). 

Although tetranucleotide repeats are less frequent than (CA)n repeats, 
efforts in the human genome project have focused on the development of 
markers based on them (Murray et al., 1994). Microsatellites based on 
tetranucleotide repeat motifs have been shown to give PCR products 
that lack the lower molecular weight “stutter” bands characteristic of 
some (CA)n repeats (Edwards et al., 1991; Edwards et al., 1992). In 
dogs, as in humans, a correlation has been demonstrated between the 
length of a microsatellite’s repeat array and its PIC value, with longer 
repeats being more polymorphic than shorter ones (Weber, 1990; 
Ostrander et al., 1993). In canines, (GAAA)n repeats are usually long, 
with many having repeat arrays longer than (GAAA)50. This particular 
class of tetranucleotide repeats has average PIC values of 0.75 
(Francisco et al., 1996). A (GAAA)n repeat occurs about once every 100-300 
kb in the canine genome (Francisco et al., 1996), only about threefold 
less than the frequency of (CA)n repeats, suggesting that these are 
sufficiently frequent and polymorphic to be a good resource for map 
building. 


C. Assignment of Genetic Markers to Chromosomes 

In the 1960s, techniques were developed to create rodent-human 
cell hybrids. These hybrids retain only a portion (typically one or two 
chromosomes) of the human genome together with a full complement 
of rodent chromosomes. It is possible to assign any human gene or 
DNA segment that can be distinguished from the rodent version, to the 
human chromosome contained within the hybrid cell line. Hybrid cell 
lines are used to assign new markers or genes to chromosomes by 
testing the marker against a comprehensive panel of cell lines in which 
each chromosome is represented at least once. For the canine genome, 
a panel of canine-rodent hybrid cell lines would be of use for exactly 
this purpose. 
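
The logic of assigning a marker with a hybrid panel can be sketched as follows. The panel, the cell-line names, and the chromosome content of each line are invented for illustration, and the function name is hypothetical; the sketch simply looks for the chromosome whose presence or absence across the panel matches the marker's pattern in every line.

```python
def assign_chromosome(marker_hits, panel):
    """Return chromosomes whose presence across the panel matches the marker's pattern exactly.

    marker_hits: dict mapping cell-line name -> True/False (marker detected, e.g. by PCR)
    panel: dict mapping cell-line name -> set of donor chromosomes retained by that line
    """
    chromosomes = set().union(*panel.values())
    return sorted(
        c for c in chromosomes
        if all((c in panel[line]) == hit for line, hit in marker_hits.items())
    )


# Hypothetical panel: line names and chromosome content are invented for illustration.
panel = {
    "H1": {1, 5},
    "H2": {2, 5},
    "H3": {1, 3},
    "H4": {2, 3, 4},
}
marker_hits = {"H1": True, "H2": True, "H3": False, "H4": False}
print(assign_chromosome(marker_hits, panel))
```

Real panels are noisier than this: hybrids may carry chromosome fragments or lose chromosomes in culture, so an actual analysis would tolerate some discordant lines rather than demand a perfect match.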

Hybrids also provide a mechanism with which to investigate the 
genomic organization of individual species and can be used to probe the 
relationships between genomes. As the numbers of genes mapped in 
organisms other than humans increases, the opportunity to study the 
syntenic relationships between chromosomes of different animals in¬ 
creases correspondingly. A set of anchored reference loci suitable for 
comparative mapping in mammals and other vertebrate classes has 
been proposed (O’Brien et al., 1993). These loci are cloned mouse and 
human functional genes, spaced an average of 5-10 cM apart throughout 
their respective genomes. Importantly, the genes are evolutionarily 
conserved and many are represented in the gene maps of 
other mammalian orders, particularly cattle and the domestic cat. So¬ 
matic cell hybrids can be used to assign anchored reference loci to 
syntenic groups in a particular animal, and then the linkage relation¬ 
ships of homologous genes can be compared between animals. Informa¬ 
tion regarding the conservation of synteny of homologous genes be¬ 
tween different organisms, as well as considerations of gene order 
within homologous gene segments, will then contribute to our under¬ 
standing of mammalian genome evolution. In addition, improved 
knowledge of conserved synteny will allow researchers investigating 
any disease to take full advantage of candidate gene approaches and 
explore the role of genes already characterized in other animals. 

D. Ordering of Markers on Chromosomes 

1. Overview 

If two markers segregate consistently with a particular chromosome, 
they must be on the same chromosome and are said to be syntenic. 
Syntenic loci are not necessarily genetically linked; they may be so far 
apart on a given chromosome that genetic recombination between 
them approaches 50% and they segregate independently. Thus, while 
somatic cell hybrids can assign genes or DNA segments to particular 
chromosomes, they provide no information regarding the position or 
order of genes on chromosomes. 

The generation of a set of linearly arranged loci for which the genetic 
distance between adjacent markers has been determined produces a 
genetic map. The genetic distance between two markers is a measure 
of the expected number of crossovers occurring on a single chromosome 
strand between the two markers at meiosis. In meiosis (the cell divi¬ 
sion leading to gamete formation), homologous chromosomes pair up. 
At this stage, each chromosome consists of two identical strands (chromatids), 
so each chromosome pair consists of four strands. During 
meiosis, homologous chromosomes separate from each other except at 
one or two zones of contact, which are called chiasmata. Chiasmata 
involve one chromatid from each pair of homologous chromosomes and 
are the points at which there is formation of crossovers between chro¬ 
matids. This process leads to genetic recombination and can involve 
either of the two chromatids. 

The alleles received by an individual from one parent are called a 
haplotype; if the same alleles are passed from grandparent to F2 offspring, 
the haplotype is described as parental or nonrecombinant. In 
Fig. 2, the father is heterozygous for both genes, having inherited the 
haplotype AB from his father and ab from his mother. If these two 
genes were inherited independently (as they would be if they were on 
different chromosomes), four types of gamete would be expected (AB, 
Ab, aB, and ab) in approximately equal proportions. If, in contrast, the 
genes were relatively close together on the same chromosome, this 
AaBb individual would be expected to produce an excess of the two 
parental haplotypes (AB and ab) over a smaller number of the nonparental 
or recombinant haplotypes (Ab and aB). Recombinant haplotypes 
are generated when a crossover occurs between two linked markers 
(Fig. 3). 

Fig. 2. The alleles inherited by an individual from one parent are known as a haplotype; 
if the same alleles are passed from grandparent to F2 offspring, the haplotype is 
parental. Linked genes give rise to an excess of parental gametes, whereas unlinked 
genes produce equal numbers of parental and nonparental gametes. Five of the six F2 
offspring inherit the parental haplotypes AB or ab, whereas one offspring receives the 
nonparental, or recombinant, Ab haplotype. 
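
The expected gamete proportions for the doubly heterozygous parent described above can be sketched as a function of the recombination fraction between the two genes. This is a minimal illustration assuming the parent's phase is AB/ab; the function name is hypothetical. Each parental haplotype is expected at frequency (1 - θ)/2 and each recombinant haplotype at θ/2, where θ denotes the recombination fraction.

```python
def gamete_frequencies(theta):
    """Expected gamete proportions from an AB/ab double heterozygote.

    theta is the recombination fraction between the two loci (0 <= theta <= 0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    parental = (1.0 - theta) / 2.0
    recombinant = theta / 2.0
    return {"AB": parental, "ab": parental, "Ab": recombinant, "aB": recombinant}


# Unlinked genes (theta = 0.5) versus genes 10% recombination apart.
print(gamete_frequencies(0.5))
print(gamete_frequencies(0.1))
```

At a recombination fraction of 0.5 all four gametes are equally frequent, as for genes on different chromosomes; at smaller values the two parental types are in excess.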

Genetic recombination is essentially random, and no two sperm or 
eggs ever carry an identical combination of genetic information. There¬ 
fore, no two offspring ever inherit exactly the same combination of 
alleles, regardless of how many offspring a set of parents produce. 
Hence, it will always be possible to tell the offspring one from another, 
with the exception of monozygotic twins. 

2. Recombination Frequency and Map Distance 

The likelihood that a crossover will occur between two markers on the 
same chromosome is proportional to the distance between them. The 
probability that a gamete or offspring will be recombinant is measured 
by the recombination frequency (θ). Genes that are unlinked segregate 
independently, with a recombination frequency of 0.5. In contrast, 
genes for which there is a consistent deviation from the 1:1 ratio of 
recombinant to nonrecombinant offspring are said to be genetically 
linked, with θ < 0.5. Crossovers cannot be seen; they are counted by 
observing the recombinant haplotypes of resulting offspring. In small 
intervals, when the probability of multiple crossovers is negligible, 
recombinations may be counted directly as crossovers. Then, the relationship 
between the recombination fraction and the distance between 
two genes (x) is simply x = θ (Morgan, 1928); this is generally true 
when θ < 0.10. 

Fig. 3. Genetic crossovers involve one chromatid from each of a pair of homologous 
chromosomes. Recombinant haplotypes are generated when a crossover occurs between 
two linked markers. 

When many closely linked markers are available on a chromosome 
and their order is known, the simplest method of determining map 
distances between these markers is to estimate the recombination 
fractions in each interval of adjacent loci. The map distance between 
more distant markers is the sum of the map distances in the intervals 
between these loci (Sturtevant, 1913). This method can only be applied 
when a large number of markers are available, enabling a given inter¬ 
val effectively to be subdivided into small intervals. Recombination 
fractions are not additive over large distances because of the occur¬ 
rence of multiple crossovers that decrease the number of recombinant 
haplotypes observed and, in turn, the genetic distance. Various mapping 
functions have been derived to convert a recombination fraction 
(θ) into a map distance (x). For example, the widely used Kosambi 
function (Kosambi, 1944) 

$$x = \tfrac{1}{2}\tanh^{-1}(2\theta) = \tfrac{1}{4}\ln\frac{1 + 2\theta}{1 - 2\theta}$$

yields a map distance of 0.236 Morgans, or 23.6 cM, for a value of 
θ = 0.22. 
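
The Kosambi conversion is easily checked numerically. The sketch below, with hypothetical function names, converts a recombination fraction to centiMorgans and back again; it reproduces the worked value given above and shows that for small recombination fractions the map distance is essentially equal to the recombination fraction.

```python
import math


def kosambi_cm(theta):
    """Map distance in centiMorgans from a recombination fraction (Kosambi, 1944)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must satisfy 0 <= theta < 0.5")
    # x (in Morgans) = 1/4 * ln[(1 + 2*theta) / (1 - 2*theta)]; multiply by 100 for cM.
    return 100.0 * 0.25 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


def kosambi_theta(distance_cm):
    """Inverse: recombination fraction from a map distance in centiMorgans."""
    return 0.5 * math.tanh(2.0 * distance_cm / 100.0)


print(round(kosambi_cm(0.22), 1))   # the worked example in the text
print(round(kosambi_cm(0.01), 3))   # small intervals: distance is close to theta
```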


3. Genetic Linkage and the LOD Score 

The process of counting numbers of recombinant and nonrecombinant 
offspring of a given mating followed by the estimation of map distance 
and determination of the likelihood of linkage versus nonlinkage is 
called genetic linkage analysis; this process is fundamental to the con¬ 
struction of genetic linkage maps. Recombination events can be de¬ 
tected only by observing haplotypes passed from parents to offspring so 
genetic linkage analysis and map construction cannot be carried out on 
unrelated individuals but requires observation of families. 

In linkage analysis between two loci, the likelihood for linkage (θ < 
0.5) versus the likelihood for free recombination (θ = 0.5) is calculated 
based on the number of observed recombinant and nonrecombinant 
offspring produced by a given mating. Conventionally, the logarithm of 
the likelihood ratio, the LOD score, 

$$Z(\theta) = \log_{10}\left[\frac{L(\theta)}{L(0.5)}\right],$$

is used as the measure of support for linkage. For example, if n observations 
consist of k recombinants and n - k nonrecombinants, the 
corresponding LOD score is given by 

$$Z(\theta) = \begin{cases} n\log_{10}(2) + k\log_{10}(\theta) + (n - k)\log_{10}(1 - \theta) & \text{if } \theta > 0 \\ n\log_{10}(2) & \text{if } \theta = 0. \end{cases}$$

A linkage group is a set of markers in which each marker is linked to 
at least one other marker with a specified LOD score. Generally, linkage 
is accepted if a recombination fraction of θ < 0.5 is supported by a 
LOD score of at least 3.0, reflecting an odds ratio of 1000:1. 
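
The two-point LOD score defined above can be computed directly. The following is a minimal sketch for phase-known data, with a hypothetical function name; it treats the case θ = 0 separately, as in the formula, and returns negative infinity if recombinants were observed but θ is set to zero, since such data are then impossible.

```python
import math


def lod_score(n, k, theta):
    """Two-point LOD score for k recombinants among n informative, phase-known meioses."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if theta == 0.0:
        # Zero recombination is only compatible with the data if no recombinants were seen.
        return n * math.log10(2) if k == 0 else float("-inf")
    return n * math.log10(2) + k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)


print(round(lod_score(10, 0, 0.0), 2))   # ten meioses, no recombinants
print(round(lod_score(20, 2, 0.1), 2))   # twenty meioses, two recombinants, theta = 0.1
```

A score of 3.0 corresponds to odds of 1000:1 in favor of linkage, so ten informative meioses with no recombinants are just enough to reach the conventional threshold.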

In the direct method of linkage analysis, recombinant and nonrecom¬ 
binant offspring of a particular mating are observed directly. However, 
for recombinants to be distinguishable from nonrecombinants, the parental 
and nonparental haplotypes must be known. In principle, a 
doubly heterozygous individual (AaBb) could have received the A allele 
in coupling with either the B or the b allele from one parent. These two 
possibilities are known as the two phases of a double heterozygote. In a 
two-generation family, the phase of the parents is unknown; therefore, 
to be able to count recombinants directly it is essential to have ge¬ 
notype information from three generations. 

The genotype of each individual is recorded and the recombination 
frequency of every pairwise combination of markers calculated. A par¬ 
ent must be heterozygous for a given pair of markers for it to be infor¬ 
mative for linkage for those markers. When, for example, a parent with 
the genotype Ab/ab produces the haplotype Ab, it is impossible to know 
if a recombination has occurred because both recombinant and non¬ 
recombinant haplotypes can have this genotype. A mating is, there¬ 
fore, only informative for linkage between a pair of given markers if at 
least one of the parents is heterozygous for both markers. 
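
The requirement that a parent be doubly heterozygous, and the direct counting of recombinants once phase is known, can be sketched as follows. Haplotypes are written as two-character strings, one character per marker, following the notation of Fig. 2; the function names and the example data are illustrative assumptions, and the sketch does not check that the transmitted haplotypes are built from the parent's own alleles.

```python
def is_informative(parent):
    """A parent is informative for linkage only if heterozygous at both markers.

    parent is a pair of haplotypes, e.g. ("AB", "ab"), one character per marker.
    """
    hap1, hap2 = parent
    return all(a != b for a, b in zip(hap1, hap2))


def count_recombinants(parent, transmitted):
    """Count recombinant haplotypes passed on by a phase-known, informative parent."""
    if not is_informative(parent):
        raise ValueError("parent is not doubly heterozygous")
    parental = set(parent)
    return sum(1 for hap in transmitted if hap not in parental)


# Example mirroring Fig. 2: five parental haplotypes and one recombinant.
father = ("AB", "ab")
offspring_haplotypes = ["AB", "ab", "AB", "ab", "AB", "Ab"]
print(count_recombinants(father, offspring_haplotypes))
print(is_informative(("Ab", "ab")))  # homozygous at the second marker
```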

4. Canine Families for Mapping 

The resolution and integrity of a genetic map depends directly on the 
number of informative meioses available to be typed with each marker. 
Consequently, genetic maps are constructed with genotyping data 
from large, three-generation families. The human genetic map has 
been constructed with a panel of over 40 three-generation families 
known as the CEPH (Centre d’Etude du Polymorphisme Humain) reference 
families (Dausset et al., 1990). These families were chosen because 
both parents and most of the grandparents were alive. In addition, 
each of the CEPH families has a large number of offspring. Cells 
from all members of each family have since been immortalized. 

The construction of genetic maps of other animals has been possible 
because of the commitment to generate very large purpose-built pedigrees 
by crossing genetically diverse strains or breeds. For example, 
investigators at the Whitehead Institute/MIT Center for Genome Research 
have constructed a mouse genetic map by genotyping over four 
thousand polymorphic microsatellites in 46 F2 intercross progeny from 
a cross between two strains of mice known as the OB and CAST 
strains. The map provides an average spacing between markers of 0.35 
cM, which corresponds to about 750 kb in the mouse (Dietrich et al., 
1994). 

The pig genetic map is being constructed using a three-generation 
pedigree generated by crossing two European wild boars with eight 
sows of the domesticated large white breed. The subsequent intercross 
of 26 F1 animals has generated 200 F2 individuals. This cross was 
chosen because the founder lines exhibit considerable phenotypic as 
well as genotypic variation (Ellegren et al., 1993), increasing the 
chances the pedigree would be informative for any given marker. 

The very high levels of inbreeding associated with most breeds of dog 
present the dog mapping project with a challenge. Any single dog 
family is only informative for a subset of markers. A panel of several 
unrelated families, representing different breeds, is therefore neces¬ 
sary. The informativeness of this panel is further maximized by avoid¬ 
ing families with inbreeding loops. The distance over which linkage 
between a pair of markers can be confirmed, with a LOD score of at 
least 3.0, depends on the number of meioses available that are infor¬ 
mative for both markers. Whereas only 10 meioses are needed to con¬ 
firm that two markers are linked if there are no recombinants, data 
from 100 meioses are necessary to support linkage over a distance of 31 
cM, illustrating the importance of large families for linkage analysis. 
Ultimately, the resolution and integrity of the dog genetic map, and its 
eventual value to breeders and the scientific community, will depend 
on appropriate pedigrees being made available to scientists construct¬ 
ing the map. 
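
The figures of 10 and 100 meioses quoted above can be checked with a short calculation. The sketch below assumes phase-known meioses and evaluates the LOD score at the observed recombination fraction k/n; the function names are hypothetical. It then finds the largest number of recombinants for which the score still reaches 3.0.

```python
import math


def lod_at_mle(n, k):
    """LOD score evaluated at the observed recombination fraction k/n."""
    theta = k / n
    if k == 0:
        return n * math.log10(2)
    return n * math.log10(2) + k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)


def max_recombinants(n, threshold=3.0):
    """Largest k (with k/n < 0.5) for which the LOD at k/n still reaches the threshold."""
    best = None
    for k in range(0, (n + 1) // 2):
        if lod_at_mle(n, k) >= threshold:
            best = k
    return best


print(max_recombinants(10))    # no recombinants can be tolerated with 10 meioses
print(max_recombinants(100))   # recombinants tolerated with 100 meioses
```

With 100 meioses the threshold is still met at 31 recombinants, a recombination fraction of 0.31, which the text equates with roughly 31 cM; nine meioses are too few to reach a score of 3.0 even with no recombinants.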

Placing the markers into linkage groups and then determining the 
order and spacing of the markers requires genotyping each family 
member with every marker for which that family is informative. Data 
from the genotyping of each new marker must be analyzed together 
with that from all previous markers in the region. Although there are a 
variety of specifically designed computer programs that can calculate 
recombination frequencies and determine the likelihood of linkage, 
such as the LINKAGE program package (Lathrop et al., 1984), 
MAPMAKER (Lander and Green, 1987), and Multimap (Matise et al., 
1994), the process of building a genetic map remains a large and expensive 
task that takes the coordinated efforts of several laboratories. 
Ongoing efforts to share and coordinate family collection and genotyping 
data will greatly reduce the overall effort and cost necessary to 
build the map. 

5. Analysis of Markers for Linkage to Disease Genes 

In the mapping of disease genes, one uses the same strategies just 
described but looks for linkage between a marker and a disease state, 
rather than between two markers. Efforts to map canine disease genes 
are best illustrated by the work of Yuzbasiyan-Gurkan and collabora¬ 
tors at Michigan State University and their studies of canine copper 
toxicosis. Canine copper toxicosis is an autosomal recessive disorder 
that is prevalent among a subset of breeds, including Bedlington terriers, 
in which the gene frequency is estimated to be about 50%. Previous 
work by the same laboratory had established that the canine disease is 
similar to a human disorder called Wilson’s disease, in which affected 
individuals suffer from progressive hepatic disease because of copper 
accumulation in the liver (Yuzbasiyan-Gurkan et al., 1993). In humans, 
the Wilson’s disease gene has been mapped to chromosome 13q 
and is closely linked to the esterase D and retinoblastoma genes. In 
dogs, these two loci are not closely linked and appear unlinked to 
copper toxicosis (Yuzbasiyan-Gurkan et al., 1993). In a more recent 
study, Yuzbasiyan-Gurkan et al. (1996) describe a microsatellite mark¬ 
er that yields a LOD score of 5.96 at a recombination fraction of 0, 
which indicates that the marker is close to the copper toxicosis gene. It 
is important to make the distinction between cases in which genes of 
interest have actually been cloned, such as those described in Section 
V, and cases in which the actual gene remains unidentified, but a 
closely positioned marker has been identified. In the case of canine 
copper toxicosis, the gene itself remains unidentified, but the marker is 
sufficiently close to the gene that it is suitable for diagnostic use. 
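
The LOD score quoted above can be illustrated with a minimal sketch. It assumes the simplest textbook case: phase-known, fully informative meioses scored at a single marker, with the likelihood at recombination fraction theta compared against the likelihood under free recombination (theta = 0.5). The function name and the counts used below are hypothetical; the data behind the reported score of 5.96 are not given here, and programs such as LINKAGE handle far more general pedigrees.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( L(theta) / L(0.5) ), where
    L(theta) = theta**R * (1 - theta)**(N - R),
    R = number of recombinants and N = number of meioses.
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    nonrecombinants = meioses - recombinants
    if theta == 0.0:
        # Any observed recombinant is impossible under theta = 0.
        if recombinants > 0:
            return float("-inf")
        log_l_theta = 0.0
    else:
        log_l_theta = (recombinants * math.log10(theta)
                       + nonrecombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)  # free recombination
    return log_l_theta - log_l_null


if __name__ == "__main__":
    # Hypothetical data: 20 informative meioses, no recombinants observed.
    for theta in (0.0, 0.05, 0.1, 0.2, 0.5):
        print(f"theta={theta:.2f}  LOD={lod_score(0, 20, theta):.2f}")
```

With no recombinants the score is highest at theta = 0 and equals N x log10(2), so about 20 recombination-free meioses give a LOD score near 6. A LOD score of 3 or more is conventionally taken as significant evidence of linkage.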


V. Targeted Cloning of Canine Genes 

A. Genes of Interest 

Simultaneous with the analysis of canine pedigrees for markers 
linked to disease, efforts are ongoing to directly clone canine genes that 
may be important in disease. In most cases, these genes are identified 
by taking advantage of the extensive similarity in DNA sequence that 
exists in coding regions between mammals. Genes likely to be important
in canine cancers have been of particular interest. For instance,
the evolutionarily conserved part of the canine homolog of the p53 gene 
has been cloned and sequenced. Studies reveal that somatic mutations 
in p53 are important in canine thyroid cancers (Devilee et al., 1994). 
The canine transforming growth factor-beta-1 gene, tumor necrosis 
factor (TNF) alpha, and c-yes proto-oncogene have also been cloned 
(Manning et al., 1995; Zhao et al., 1995; Zucker et al., 1994). As with
other genes, the level of homology is generally high. The canine
TNF-alpha gene, for instance, displays 90% sequence homology to the
corresponding human TNF-alpha cDNA (Zucker et al., 1994). Finally, the
canine BRCA1 homolog has been cloned and sequenced (Szabo et al., 
1996). The gene is highly homologous to the human gene, which has 
been shown to be a major cause of familial breast cancer in humans 
(Hall et al., 1990; Miki et al., 1994). Its relative importance in canine 
breast cancer, however, remains to be tested. 

Several genes have been cloned in an effort to learn more about 
specific canine diseases. For instance, the gene for canine factor IX was 
cloned by Evans et al. (1989). This gene possesses 86% identity at the 
amino acid level with the human counterpart and is likely to be important
in canine hemophilia B. In other studies, the canine homolog of the
rod-opsin gene, which in humans is associated with several forms of
retinitis pigmentosa, was cloned (Petersen-Jones et al., 1994). Because many
dog breeds suffer from apparently hereditary progressive retinopathies
that are similar to retinitis pigmentosa, this gene is a likely
candidate for inherited canine eye diseases, although this remains to
be shown. The description of a male-specific progressive retinal atrophy
(PRA) syndrome in Siberian huskies by Acland et al. (1994) suggests
that multiple genes are important in canine PRA, at least one of
which is on the X chromosome. 

Analysis of several other genes has been useful for studies compar¬ 
ing gene functions between animals. For instance, cloning and se¬ 
quencing of the canine insulin gene predicts that the canine preproin¬ 
sulin molecule contains an additional C-peptide fragment of 31 amino 
acids that is not observed in humans. The role of this protein remains 
unclear, but ongoing studies are expected to provide further insight 
into the mechanism by which the protein works (Kwok et al., 1983). 

B. The Canine Immune and Hematopoietic Systems 

Another class of genes that have been the target of several studies 
are those involved in the canine immune system. Differences in the 
canine immune system are likely to account for the differential
susceptibility to disease and autoimmune disorders that is noted between
breeds of dog and can serve as models for some human diseases. The
dog major histocompatibility locus (DLA) is among the best-studied
canine gene families. The DLA is divided into three classes of
molecules: Ia, Ib, and II. Class I genes are expressed on most tissues
and cells and encode membrane glycoproteins. Burnett and Geraghty 
(1995) have studied the structure and expression of a canine class I 
gene, DLA-79. The limited polymorphism, low messenger ribonucleic 
acid (mRNA) expression, and divergent structure of this gene suggest 
that it is an analog of the major histocompatibility complex (MHC) Ib
genes in humans and rodents. 

The canine class II genes encode heterodimeric glycoproteins con¬ 
sisting of an alpha and a beta chain that are involved in the control of 
the immune response and antigen presentation. Sarmiento et al. 
(1990) first reported the isolation and characterization of a canine class 
II gene that appeared to belong to the canine DRB family. Since then, 
Wagner et al. (1995, 1996a, 1996b, 1996c) have reported on the poly¬ 
morphism of canine DRB genes as well as isolation of DRA and DQA 
genes. As in other mammalian species, these genes are highly homolo¬ 
gous to the human and murine counterparts. 

In addition to work on the DLA system just described, several other 
canine genes have been cloned and studied in an effort to better under¬ 
stand the canine hematopoietic and immune system. These include 
hematopoietic growth factors and cytokines such as the stem cell fac¬ 
tor, IL-8, interferon, and GM-CSF genes. The canine CD34 gene, which 
encodes a protein by which progenitor stem cells are recognized, has 
also been cloned and sequenced (McSweeney et al., 1996). Although the
primary interest in these genes stems from a desire to better use the 
dog as a model for bone marrow transplantation, the cloning and se¬ 
quencing of these genes are a useful measure of homology among the 
human, canine, and murine genomes. 


VI. Considerations and Conclusions 

The identification of pedigrees suitable for genetic mapping and the
generation of a genetic map are not ends in themselves. The map is simply a
tool to accelerate the positional assignment of inherited traits as
an initial step toward gene identification and characterization. Al¬ 
though a real possibility, the treatment of dogs affected with inherited 
disease is many years in the future; the ability to recognize
phenotypically healthy dogs carrying genetic diseases, however, is already
a reality for a small number of diseases. As the map is assembled, 
genetic markers linked to inherited diseases will be identified and the 
list of diseases for which carrier detection is possible will grow very 
rapidly. 

The ability to identify dogs that carry the gene for an inherited 
disease is exciting. It provides the means with which to control that 
disease in the population by appropriate breeding programs. However, 
there are important issues associated with the ability to recognize 
carriers that must be considered by breeders, kennel clubs, and veter¬ 
inarians in this new era in canine genetics. For instance, the develop¬ 
ment of a registry of dogs tested, and their carrier status, should be 
considered. Such a resource would play a central role in implementing 
breeding programs to control or eliminate diseases for which a genetic 
test is available, and it would provide a source of information for con¬ 
sumers concerned with buying a healthy dog. A related issue is that of 
confidentiality. For instance, should it be a requirement of kennel club 
registration for puppies to be tested for a disease known to be a prob¬ 
lem for their particular breed? In addition, once a genetic test becomes 
available, should it be compulsory for test results to be entered in the 
registry? 

As each new genetic test becomes available, a major consideration 
will be deciding which breeds should be tested. Disease alleles will 
exist in different breeds at varying frequencies and genetic testing is 
expensive. Therefore, mechanisms must be established to guide breed 
clubs as they decide how common a disease need be before genetic 
testing is required. 
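
As a rough guide to how allele frequency translates into the proportion of carriers and affected dogs in a breed, the following sketch applies the Hardy-Weinberg relationship for a single autosomal recessive allele. It assumes random mating within the breed, an assumption that purebred populations often violate, so the figures are only a first approximation. The function name is hypothetical.

```python
def hardy_weinberg(q: float) -> dict[str, float]:
    """Expected genotype fractions for a recessive disease allele at frequency q.

    Assumes Hardy-Weinberg equilibrium (random mating, no selection).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    p = 1.0 - q  # frequency of the normal allele
    return {
        "noncarrier": p * p,
        "carrier": 2.0 * p * q,
        "affected": q * q,
    }


if __name__ == "__main__":
    for q in (0.5, 0.1, 0.01):
        fractions = hardy_weinberg(q)
        summary = ", ".join(f"{k}: {v:.2%}" for k, v in fractions.items())
        print(f"q={q}: {summary}")
```

Under these assumptions an allele frequency of 50%, as estimated for copper toxicosis in Bedlington terriers, implies that about half of the dogs are carriers and a quarter are affected, whereas at an allele frequency of 1% roughly 2% of dogs are carriers and affected dogs are very rare.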

The biggest problem currently facing breeders is that dogs carrying 
recessive inherited disorders are usually indistinguishable from non¬ 
carriers; only when they have produced affected offspring is their carri¬ 
er status recognized. The ability to identify dogs as carriers as soon as 
they are born means choices can be made as to whether carriers are 
permitted to breed. Crossing a carrier with a noncarrier will
produce approximately 50% carrier offspring but will never produce
affected dogs. Breeders and veterinarians will need to seek advice from
geneticists regarding the best way to eliminate each particular dis¬ 
ease. Breeders need to decide whether merely preventing the birth of 
affected dogs is satisfactory; the elimination of disease alleles from a 
population would require that only noncarriers be crossed. Stipulations
of this nature would obviously reduce the number of dogs available
for breeding and may mean that dogs carrying other very desirable
traits will not be bred. Breeders and breed clubs need to come to
terms with any compromises in the breed standard that may result 
from this practice. 
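
The expected outcomes of the breeding choices discussed above can be enumerated with a simple Punnett square. The sketch below assumes a single autosomal locus with full penetrance; the allele symbols ("N" for the normal allele, "d" for the recessive disease allele) and the function name are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Genotype -> status at one autosomal locus ("N" normal, "d" disease allele).
STATUS = {"NN": "noncarrier", "Nd": "carrier", "dd": "affected"}


def cross(parent1: str, parent2: str) -> dict[str, Fraction]:
    """Expected fractions of offspring by status for a single-locus cross."""
    counts = Counter()
    # Each parent passes on one of its two alleles with equal probability.
    for a, b in product(parent1, parent2):
        genotype = "".join(sorted((a, b)))  # "dN" and "Nd" are the same genotype
        counts[STATUS[genotype]] += 1
    total = sum(counts.values())
    return {status: Fraction(n, total) for status, n in counts.items()}


if __name__ == "__main__":
    for p1, p2 in (("Nd", "NN"), ("Nd", "Nd"), ("NN", "NN")):
        result = {k: str(v) for k, v in cross(p1, p2).items()}
        print(f"{p1} x {p2}: {result}")
```

A carrier crossed with a noncarrier yields half carriers and no affected offspring, a carrier crossed with a carrier yields one quarter affected offspring, and only crosses between noncarriers remove the disease allele from the next generation.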

Finally, it is important for these issues to be addressed now. Breed¬ 
ers, veterinarians, and kennel clubs must prepare for the dilemmas 
that genetic testing will present. Dog breeders, long the advocates for 
their dogs, must be willing to educate themselves so they may contrib¬ 
ute in a positive way as these decisions are debated in the coming 
years. Clearly, the potential to eliminate the inherited diseases that 
plague so many breeds cannot be ignored, but it will be hard to turn 
away from short-term losses and keep focused on long-term goals. The 
desire to have happier, healthier, more long-lived animals must remain 
the long-term goal. Toward that goal, each breeder should be prepared 
to take full advantage of the opportunities provided by genetic testing 
and the development of a canine genetic map. 